Preface


This book is a second edition of the book of the same title by the first author 
which was published in 2000. The subject of ruin probabilities and related top- 
ics has since then undergone a considerable development, not to say boom. This 
much expanded and revised second edition aims at covering a substantial part 
of these developments as well as the classical topics. 


Risk theory in general and ruin probabilities in particular are traditionally
considered as part of insurance mathematics, and have been an active area of
research from the days of Lundberg all the way up to today. One reason for
writing this book is a feeling that the area has in recent years achieved a con- 
siderable mathematical maturity, which has in particular removed one of the 
standard criticisms of the area, namely that it can only say something about 
very simple models and questions. Although in insurance practice, usually sim- 
pler (and coarser) risk measures like Value-at-Risk are used, it is widely believed 
that the thinking advocated by ruin theory is still important for modern risk 
management. For instance, in times of market-consistent valuation principles, 
the role of the time diversification effect of insurance portfolios, which is one of 
the core elements of ruin theory, should not be forgotten. In addition, ruin the- 
ory has fruitful methodological links and applications to other fields of applied 
probability, like queueing theory and mathematical finance (pricing of barrier 
options, credit products etc.). Apart from these remarks, we have deliberately 
stayed away from discussing the practical relevance of the theory; if the formu- 
lations occasionally give a different impression, it is not by intention. Thus, the 
book is basically mathematical in its flavor. 


The present second edition is more than 50% longer than the first and has 
more than double the number of references. The longer parts of the new mate- 
rial, reflecting subareas that have been particularly active in the last decade, are 
collected in Chapters XI-XIV, which treat Lévy processes, Gerber-Shiu func- 
tions, dependence and stochastic control, respectively. Shorter additions include 
more about martingales and generators (II.4), various versions in Chapter VIII
of models with level dependence, e.g. tax or stochastic investments, Erlangiza- 
tion (IX.8), statistical techniques for distinguishing between light and heavy 
tails (X.6), more material on discrete-time risk models (XVI.1) and recent ad- 
vances in simulation techniques scattered in Chapter XV. In addition, there are 
amendments and updates at a large number of places. 


A book like this can be organized in many ways. One is by model, another 
by method. The present book is somewhere between these two possibilities. 
Chapters IV-VIII introduce some of the main models and give a first derivation 
of some of their properties. Chapters IX-XV then go into more depth with 
some of the special approaches for analyzing specific models and add a number 
of results on the models in Chapters IV-VIII. Chapters II and III are essentially 
methodological in flavor. 

Here is a suggestion on how to get started with the book. For a brief ori- 
entation, read first Chapter I, continue with II.1-3 to see some of the simplest 
ruin calculations, the first part of III.5 (to understand the Pollaczek-Khinchine
formula in IV.2 more properly), and then, to get acquainted with the classical
theory of the Cramér-Lundberg model, IV.1-5, V.4a, VIII.1, IX.1-3 and X.1-2.
For a second reading, incorporate II.4, III.1-3, IV.8-9, V.1-2, V.5, VII.1-3,
VII.2, X.3-4, XII.1-2, XIII.1-2 and XV.1-3. The rest is up to your specific 
interests. Enjoy! 


The symbols used for the quantities appearing in the book differ among the 
disciplines. We chose to use those that are common in the queueing community, 
partly also to be in line with the first edition. We apologize for the confusion 
this may cause for readers who are used to other symbols. In a book project 
like this it is impossible to avoid conflicts of notation in the sense that the same 
symbol may be used for different quantities. We hope that it will always be clear 
from the context what the notation refers to. In addition, we have collected a 
number of conventions, abbreviations and symbols after this Preface. 

We have tried to be fairly exhaustive in citing references close to the text, 
but it is obvious that such a system involves a number of inconsistencies and 
omissions, for which we apologize to the reader and to the authors of the many 
papers that ought to have been on the list. 

We intend to keep a list of misprints and remarks posted on the web page 


http://www.hec.unil.ch/halbrecher/rp2.html 
and we are grateful to get relevant material sent by email to 


hansjoerg.albrecher@unil.ch 

Finally, we would like to thank Corina Constantinescu, Hans Gerber, Peter
Glynn, Dominik Kortschak, Ronnie Loeffen, Stefan Thonhauser and Hailiang 
Yang for discussions and proofreading parts of the manuscript, and Dominik 
Kortschak for help with some figures and general LaTeX issues. 

Most of all, we would like to thank our wives May Lise and Renate for their 
support and patience during the writing of this book. 


Aarhus and Lausanne, May 2010 


Søren Asmussen Hansjörg Albrecher 
Notation and conventions


Numbering and reference system 


The chapter number is specified only when it is not the current one. Thus 
Proposition 4.2, formula (5.3) or Section 3 of Chapter VI are referred to 
as Proposition VI.4.2, formula VI.(5.3) and Section VI.3 (or just VI.3), 
respectively, in all other chapters whereas in VI we just write Proposition 
4.2, formula (5.3) or Section 3. References like Proposition A.4, (A.29) 
refer to the Appendix. 


Throughout the book, [APQ] refers to the first author’s earlier book Ap- 
plied Probability and Queues, reference [69]. 


Abbreviations 


a.s. almost surely 

c.d.f. cumulative distribution function $\mathbb{P}(X \le x)$

c.g.f. cumulant generating function, i.e. $\log \hat{B}[s]$ where $\hat{B}[s]$ is the m.g.f.

IDE integro-differential equation

i.i.d. independent identically distributed

i.o. infinitely often

l.h.s. left-hand side (of equation)

m.g.f. moment generating function, see under $\hat{B}[s]$ below.

ODE ordinary differential equation 

r.h.s. right-hand side (of equation) 


r.v. random variable 


s.c.v. squared coefficient of variation, $\mathbb{E}X^2/(\mathbb{E}X)^2$.


w.r.t. with respect to 


w.p. with probability 


Mathematical notation 


$\mathbb{P}$ probability.

$\mathbb{E}$ expectation.


$\sim$ Used in asymptotic relations to indicate that the ratio between two
expressions is 1 in the limit. E.g. $n! \sim \sqrt{2\pi}\, n^{n+1/2} e^{-n}$, $n \to \infty$.


$\approx$ A different type of asymptotics: less precise, say a heuristic
approximation, or a more precise one like $e^h \approx 1 + h + h^2/2$, $h \to 0$.

$\stackrel{\log}{\sim}$ Used in asymptotic relations to indicate that the ratio between
logarithms of two expressions is 1.

$\le_{\mathrm{st}}$ stochastic order.

$\le_{\mathrm{cx}}$ convex order.

$\le_{\mathrm{icx}}$ increasing convex order (i.e. stop-loss order).

$\le_{\mathrm{sm}}$ supermodular order.

$\Re(s)$ the real part of a complex number $s$.

$B$ The same symbol $B$ is used for a probability measure $B(dx) = \mathbb{P}(X \in dx)$
and its c.d.f. $B(x) = \mathbb{P}(X \le x) = \int_{-\infty}^{x} B(dy)$.


$\hat{B}[r]$ the m.g.f. $\int_{-\infty}^{\infty} e^{rx} B(dx)$ of the distribution $B$. If, as for typical claim
size distributions, $B$ is concentrated on $[0,\infty)$, $\hat{B}[r]$ is always defined
if $\Re(r) \le 0$ and sometimes in a larger strip (for example, if $\overline{B}(x) \sim
c e^{-\delta x}$, then for $\Re(r) < \delta$). The Laplace-Stieltjes transform is $\hat{B}[-s]$.

$\overline{B}(x)$ the tail $1 - B(x) = \mathbb{P}(X > x)$ of $B$.

$\lambda(x)$ the failure rate of the distribution $B$, i.e. $\lambda(x) = b(x)/\overline{B}(x)$.


$\|G\|$ the total mass (variation) of a (signed) measure $G$. In particular, for
a probability distribution $\|G\| = 1$, and for a defective probability
distribution $\|G\| < 1$.


$\mu_B$ the mean $\mathbb{E}X = \int x\, B(dx)$ of $B$.

$\mu_B^{(n)}$ the $n$th moment $\mathbb{E}X^n = \int x^n\, B(dx)$ of $B$.

$I(A)$ the indicator function of the event $A$.

$\mathbb{E}[X; A]$ means $\mathbb{E}[X I(A)]$.


$\Box$ marks the end of a proof, an example or a remark.

$X_{t-}$ the left limit $\lim_{s \uparrow t} X_s$, i.e. the value just before $t$.

$C^n$ the space of functions that are $n$ times continuously differentiable.


$D[0,\infty)$ the space of $\mathbb{R}$-valued functions which are right-continuous and
have left limits. Unless otherwise stated, all stochastic processes con- 
sidered in this book are assumed to have sample paths in this space. 
Usually, the processes we consider are piecewise continuous, i.e. only 
have finitely many jumps in each finite interval. Then the assumption 
of D-paths just means that we use the convention that the value at 
each jump epoch is the right limit rather than the left limit. 

In the French-inspired literature, often the term ‘càdlàg’ (continue
à droite avec limites à gauche) is used for the D-property.


$N(\mu,\sigma^2)$ the normal distribution with mean $\mu$ and variance $\sigma^2$.


Matrices and vectors 
are denoted by bold letters. Usually, matrices have uppercase Roman
or Greek letters like $\mathbf{T}$, $\boldsymbol{\Lambda}$, row vectors have lowercase Greek letters
like $\boldsymbol{\alpha}$, $\boldsymbol{\pi}$, and column vectors have lowercase Roman letters like $\mathbf{t}$,
$\mathbf{a}$. In particular:
$\mathbf{I}$ is the identity matrix.
$\mathbf{e}$ is the column vector with all entries equal to 1.
$\mathbf{e}_i$ is the $i$th unit column vector, i.e. the $i$th entry is 1 and all others
are 0.
(The dimension is usually clear from the context and left unspecified
in the notation.) For a given set $x_1, \ldots, x_p$ of numbers,

$(x_i)_{\mathrm{diag}}$ denotes the diagonal matrix with the $x_i$ on the diagonal,
$(x_i)_{\mathrm{row}}$ denotes the row vector with the $x_i$ as components,
$(x_i)_{\mathrm{col}}$ denotes the column vector with the $x_i$ as components.
$^{\mathsf{T}}$ is the transposition operator acting on vectors or matrices. E.g.,
the $i$th unit row vector is $\mathbf{e}_i^{\mathsf{T}}$.


Special notation for risk processes 


$A_t$ the total claims up to time $t$.

$\beta$ the arrival intensity (when the arrival process is Poisson). Notation
like $\beta_i$ and $\beta(t)$ in Chapter VII has a similar, though slightly more
complicated, intensity interpretation.


$B$ the claim size distribution. Notation like $B_i$ and $B^{(t)}$ in Chapter VII
has a similar, though slightly more complicated, interpretation.

$B_0$ the stationary excess (integrated tail) distribution of $B$.

$B_t$ Brownian motion.


$\delta$ the rate parameter of $B$ for the exponential case $\overline{B}(x) = e^{-\delta x}$.
Alternatively, at some places the discount rate for time-dependent
considerations.


$\eta$ the safety loading, cf. I.1.


$\gamma$ The adjustment coefficient (for negative drift, the positive solution of
$\kappa(\gamma) = 0$).

$\gamma_\delta$ The adjustment coefficient in a time-dependent context (for negative
drift, the smallest positive solution of $\kappa(\alpha) = \delta$).


$\kappa(\alpha)$ the c.g.f. of the increment distribution in the context of discrete-time
random walks; for continuous-time processes with stationary
and independent increments, $\kappa(\alpha)$ is the Lévy exponent as defined in
III.(3.5). For Markov additive processes, the corresponding extension
is discussed in II.4 and VI.3b. In Section XVI.1, an adaptation for
processes with dependent increments is used.


$m(u)$ the Gerber-Shiu function for initial capital $u$.


$\nu$ The Lévy measure. At some places also the rate parameter of an
exponential distribution.


$\phi$ Depending on the context it is sometimes used as a symbol for a m.g.f.
and sometimes as the survival probability $\phi(u) = 1 - \psi(u)$.


$\psi(u)$ the ruin probability for initial capital $u$.

$\psi(u,T)$ the ruin probability up to time $T$ for initial capital $u$.

$\hat{\psi}[-s]$ the Laplace transform $\int_0^\infty e^{-su}\psi(u)\,du$ of the ruin probability.

$R_t$ the risk reserve process at time $t$.


$\rho$ the net amount $\beta\mu_B$ of claims per unit time, or quantities with a similar
time average interpretation, cf. I.1.


$\rho_\delta$ The absolute value of the largest non-positive solution of the
time-dependent Lundberg equation (for negative drift, $\kappa(-\rho_\delta) = \delta$).


$S_t$ the claim surplus process (i.e. aggregate loss process) $u - R_t$ at time $t$.

$\tau(u)$ the time of ruin for initial capital $u$.


$u$ usually the initial capital (in the chapter on control, $u$ denotes the
control strategy).


$W_t$ sometimes an alternative notation for $B_t$, i.e. Brownian motion.

$\xi(u)$ the deficit at ruin of the process $R_t$ starting in $u$ (or, equivalently,
the overshoot over $u$ of $S_t$).


$\mathbb{P}_L, \mathbb{E}_L$ the probability measure and its corresponding expectation
corresponding to the exponential change of measure given by Lundberg
conjugation, cf. e.g. IV.5, VII.5.
Chapter I


Introduction 


1 The risk process 


In this chapter, we give a very brief summary of some of the models, results and 
topics to be studied in the rest of the book, and some terminology is introduced. 

A risk reserve process $\{R_t\}_{t \ge 0}$, as defined in broad terms, is a model for the
time evolution of the reserves of an insurance company. We denote throughout
the initial reserve by $u = R_0$. The probability $\psi(u)$ of ultimate ruin is the
probability that the reserve ever drops below zero,

$$\psi(u) = \mathbb{P}\Big(\inf_{t \ge 0} R_t < 0\Big) = \mathbb{P}\Big(\inf_{t \ge 0} R_t < 0 \,\Big|\, R_0 = u\Big).$$

The probability of ruin before time $T$ is

$$\psi(u,T) = \mathbb{P}\Big(\inf_{0 \le t \le T} R_t < 0\Big).$$

We also refer to $\psi(u)$ and $\psi(u,T)$ as ruin probabilities with infinite horizon and
finite horizon, respectively. They are the main topics of study of the present
book.

For mathematical purposes, it is frequently more convenient to work with
the claim surplus process (also called aggregate loss process) $\{S_t\}_{t \ge 0}$, defined by
$S_t = u - R_t$. Letting

$$\tau(u) = \inf\{t \ge 0 : R_t < 0\} = \inf\{t \ge 0 : S_t > u\}, \tag{1.1}$$
$$M = \sup_{0 \le t < \infty} S_t, \qquad M_T = \sup_{0 \le t \le T} S_t, \tag{1.2}$$

be the time to ruin and the maxima with infinite and finite horizon, respectively,
the ruin probabilities can then alternatively be written as

$$\psi(u) = \mathbb{P}(\tau(u) < \infty) = \mathbb{P}(M > u), \tag{1.3}$$
$$\psi(u,T) = \mathbb{P}(\tau(u) \le T) = \mathbb{P}(M_T > u). \tag{1.4}$$


So far we have not imposed any assumptions on the risk reserve process. 
However, the following set-up will cover a main part of the book: 


• There are only finitely many claims in finite time intervals. That is,
the number $N_t$ of arrivals in $[0,t]$ is finite. We denote the interarrival
times of claims by $T_2, T_3, \ldots$ and $T_1$ is the time of the first claim. Thus,
the time of arrival of the $n$th claim is $\sigma_n = T_1 + \cdots + T_n$, and $N_t =
\min\{n \ge 0 : \sigma_{n+1} > t\} = \max\{n \ge 0 : \sigma_n \le t\}$.


• The size of the $n$th claim is denoted by $U_n$.
• Premiums flow in at rate $p$, say, per unit time.


Putting things together, we see that 


$$R_t = u + pt - \sum_{k=1}^{N_t} U_k, \qquad S_t = \sum_{k=1}^{N_t} U_k - pt. \tag{1.5}$$


The sample paths of $\{R_t\}$ and $\{S_t\}$ and the connection between the two
processes are illustrated in Fig. I.1.


FIGURE I.1 
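
As a small illustration of (1.1) and (1.5), the following Python sketch evaluates $R_t$, $S_t$ and the ruin time for a given finite list of claims. The function names and the sample data are hypothetical and chosen only for this illustration; the sketch assumes $p \ge 0$, so that the reserve can only drop below zero at a claim epoch.

```python
from bisect import bisect_right


def risk_reserve(t, u, p, arrival_times, claim_sizes):
    """R_t = u + p*t - (sum of claims arrived in [0, t]), cf. (1.5).

    arrival_times must be sorted increasingly (the sigma_n of the text).
    """
    n_t = bisect_right(arrival_times, t)  # N_t: number of arrivals in [0, t]
    return u + p * t - sum(claim_sizes[:n_t])


def claim_surplus(t, u, p, arrival_times, claim_sizes):
    """S_t = u - R_t, the claim surplus process."""
    return u - risk_reserve(t, u, p, arrival_times, claim_sizes)


def ruin_time(u, p, arrival_times, claim_sizes):
    """tau(u) restricted to the listed claims, or None if no ruin occurs.

    Assumes p >= 0, so R_t is non-decreasing between claims and ruin can
    only happen at an arrival epoch.
    """
    paid = 0.0
    for sigma, claim in zip(arrival_times, claim_sizes):
        paid += claim
        if u + p * sigma - paid < 0:
            return sigma
    return None


# Hypothetical data: three claims, premium rate 1, initial reserve 3.
arrivals = [1.0, 2.5, 4.0]
claims = [2.0, 1.0, 6.0]
print(risk_reserve(2.0, 3.0, 1.0, arrivals, claims))
print(ruin_time(3.0, 1.0, arrivals, claims))
```

With these data the third claim pushes the reserve below zero, so the reported ruin time is the third arrival epoch; a larger initial reserve such as $u = 10$ survives all three claims.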
Note that it is a matter of taste (or mathematical convenience) whether one
allows $\{R_t\}$ and/or $\{S_t\}$ to continue its evolution after the time $\tau(u)$ of ruin.
Thus, for example, one could well replace $R_t$ by $R_{t \wedge \tau(u)}$ or $R_{t \wedge \tau(u)} \vee 0$. For the
purpose of studying ruin probabilities this distinction is, of course, immaterial.
Some main examples of models not incorporated in the above set-up are: 


• Models which are non-homogeneous in space, for example with a premium
depending on the reserve (i.e. on Fig. I.1 the slope of $\{R_t\}$ should depend
also on the level). We study this case in Chapter VIII.


• Brownian motion or more general diffusions. Traditionally, Brownian motion
has mainly been used as an approximation to the risk process rather
than as a model of intrinsic merit and we look at this in Chapter V. However,
since any modeling involves some approximative assumptions, it has
(partly inspired by the modeling in mathematical finance) become more
and more common to use Brownian motion as an intrinsically reasonable
model.


• General Lévy processes (defined as continuous time processes with stationary
independent increments) where the jump component has infinite
Lévy measure, allowing a countable infinity of jumps on Fig. I.1. We treat
Lévy processes in Chapter XI.


The models we consider will often have the property that there exists a
constant $\rho$ such that

$$\frac{1}{t} \sum_{k=1}^{N_t} U_k \;\xrightarrow{\text{a.s.}}\; \rho, \qquad t \to \infty. \tag{1.6}$$

The interpretation of $\rho$ is as the average amount of claim per unit time. A
further basic quantity is the safety loading (or the security loading) $\eta$ defined as
the relative amount by which the premium rate $p$ exceeds $\rho$,

$$\eta = \frac{p - \rho}{\rho}.$$

It is sometimes stated in the theoretical literature that the typical values of the
safety loading $\eta$ are relatively small, say 10%-20%; we shall, however, not
discuss whether this actually corresponds to practice. It would appear obvious,
however, that the insurance company should try to ensure $\eta > 0$, and in fact:


Proposition 1.1 Assume that (1.6) holds. If $\eta < 0$, then $M = \infty$ a.s. and
hence $\psi(u) = 1$ for all $u$. If $\eta > 0$, then $M < \infty$ a.s. and hence $\psi(u) < 1$ for
all sufficiently large $u$.

Proof. It follows from (1.6) that

$$\frac{S_t}{t} = \frac{\sum_{k=1}^{N_t} U_k - pt}{t} \;\xrightarrow{\text{a.s.}}\; \rho - p, \qquad t \to \infty.$$

If $\eta < 0$, then this limit is $> 0$ which implies $S_t \to \infty$ a.s. and hence $M = \infty$ a.s. If
$\eta > 0$, then similarly $\lim S_t/t < 0$, $S_t \to -\infty$ a.s., $M < \infty$ a.s. $\Box$


In concrete models, we typically obtain a somewhat stronger conclusion,
namely that $M = \infty$ a.s., $\psi(u) = 1$ for all $u$ holds also when $\eta = 0$, and that
$\psi(u) < 1$ for all $u \ge 0$ when $\eta > 0$. However, this needs to be verified in each
separate case.

The simplest concrete example (to be studied in Chapter IV) is the Cramér-
Lundberg or compound Poisson model, where $\{N_t\}$ is a Poisson process with
rate $\beta$ (say) and $U_1, U_2, \ldots$ are i.i.d. and independent of $\{N_t\}$. Here it is easy to
see that $\rho = \beta\,\mathbb{E}U$ (on the average, $\beta$ claims arrive per unit time and the mean
of a single claim is $\mathbb{E}U$) and that also

$$\lim_{t \to \infty} \frac{1}{t}\, \mathbb{E} \sum_{k=1}^{N_t} U_k = \rho. \tag{1.7}$$
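
The identity $\rho = \beta\,\mathbb{E}U$ can be checked numerically. The following is a minimal simulation sketch of the compound Poisson model, not a method taken from the text: the parameter values (arrival rate 3, exponential claims with rate 2, premium rate 1.8), the seed and the function names are all assumptions made for the illustration.

```python
import random


def simulate_total_claims(beta, claim_sampler, horizon, rng):
    """Total claims sum_{k <= N_t} U_k at t = horizon for Poisson arrivals.

    beta is the arrival rate; claim_sampler(rng) draws one claim size.
    """
    t, total = 0.0, 0.0
    while True:
        t += rng.expovariate(beta)  # interarrival time, exponential(beta)
        if t > horizon:
            return total
        total += claim_sampler(rng)


def safety_loading(p, rho):
    """eta = (p - rho) / rho, the relative premium surplus."""
    return (p - rho) / rho


# Hypothetical parameters for the illustration.
beta, delta, p = 3.0, 2.0, 1.8
horizon = 20000.0
rng = random.Random(2010)

rho = beta / delta  # beta * EU with EU = 1/delta for exponential claims
rho_hat = simulate_total_claims(
    beta, lambda r: r.expovariate(delta), horizon, rng
) / horizon

print("theoretical rho:", rho)
print("time-average estimate:", rho_hat)
print("safety loading eta:", safety_loading(p, rho))
```

For a long horizon the time average of the claims should be close to $\rho$, in line with (1.6), and with these hypothetical numbers the safety loading is positive.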


Again, (1.7) is a property which we will typically encounter. However, not all 
models considered in the literature have this feature: 


Example 1.2 (COX PROCESSES) Here $\{N_t\}$ is a Poisson process with random
rate $\beta(t)$ (say) at time $t$. If $U_1, U_2, \ldots$ are i.i.d. and independent of $\{(\beta(t), N_t)\}$,
it is not too difficult to show that $\rho$ as defined by (1.6) is given by

$$\rho = \mathbb{E}U \cdot \lim_{t \to \infty} \frac{1}{t} \int_0^t \beta(s)\, ds$$

(provided the limit exists). Thus $\rho$ may well be random for such processes,
namely, if $\{\beta(t)\}$ is non-ergodic. The simplest example is $\beta(t) = V$ where $V$
is a r.v. This case is referred to as the mixed Poisson process, with the most
notable special case being $V$ having a Gamma distribution, corresponding to
the Pólya process.


We shall only encounter a few instances of a Cox process, in connection with
risk processes in a Markovian or periodic environment (Chapter VII), and here
(1.6), (1.7) hold with $\rho$ constant.
Proposition 1.3 Assume $p \ne 1$ and define $\tilde{R}_t = R_{t/p}$. Then the connection
between the ruin probabilities for the given risk process $\{R_t\}$ and those $\tilde{\psi}(u)$,
$\tilde{\psi}(u,T)$ for $\{\tilde{R}_t\}$ is given by

$$\psi(u) = \tilde{\psi}(u), \qquad \psi(u,T) = \tilde{\psi}(u, Tp).$$

The proof is trivial. Since $\{\tilde{R}_t\}$ has premium rate 1, the role of the result is
to justify taking $p = 1$, which is feasible since in most cases the process $\{\tilde{R}_t\}$
has a structure similar to that of $\{R_t\}$ (for example, the claim arrivals are Poisson or
renewal at the same time). Note that when $p = 1$, the assumption $\eta > 0$ is
equivalent to $\rho < 1$; in a number of models, we shall be able to identify $\rho$ with
the traffic intensity of an associated queue, and in fact $\rho < 1$ is the fundamental
assumption of queueing theory ensuring steady-state behavior (existence of a
limiting stationary distribution).


Notes and references The study of ruin probabilities, often referred to as collec- 
tive risk theory or just risk theory, was largely initiated in Sweden in the first half of 
the century. Some of the main general ideas were laid down by Lundberg [614], while 
the first mathematically substantial results were given in Lundberg [615] and Cramér 
[265]; another important early Swedish work is Täcklind [826]. The Swedish school
was pioneering not only in risk theory, but also in probability and applied probability 
as a whole; in particular, many results and methods in random walk theory originate 
from there and the area was ahead of related ones like queueing theory. 

Some early surveys are given in Cramér [265], Segerdahl [792] and Philipson [699]. 
Some main later textbooks are (in alphabetical order) Bühlmann [208], Dickson [309],
Daykin, Pentikäinen & Pesonen [279], De Vylder [300], Gerber [398], Grandell [429], 
Rolski, Schmidli, Schmidt & Teugels [746] and Seal [784, 788]. Besides in standard 
journals in probability and applied probability, the research literature is often published 
in journals like Astin Bulletin, Insurance: Mathematics and Economics, the North 
American Actuarial Journal, the Scandinavian Actuarial Journal and Mitteilungen 
der Schweizerischen Aktuarvereinigung. Note that the latter has recently been merged 
with Blätter der Deutschen Gesellschaft für Versicherungs- und Finanzmathematik 
and a number of further Actuarial Bulletins of European countries into The European 
Actuarial Journal. 

The term risk theory is often interpreted in a broader sense than just
the study of ruin probabilities. An idea of the additional topics and problems one may
incorporate under risk theory can be obtained from the survey paper [665] by Norberg; 
see also Chapter XVI. In the even more general area of non-life insurance mathematics, 
some main texts (typically incorporating some ruin theory but emphasizing the topic 
to a varying degree) are Bowers et al. [195], Bühlmann [208], Daykin et al. [279], 
Embrechts et al. [349], Heilmann [458], Hipp & Michel [468], Kaas et al. [515], 
Klugman, Panjer & Willmot [536], Mikosch [638], Schmidt [782], Straub [818], Sundt 
[820] and Taylor [840]. Note that life insurance (e.g. Gerber [402]) has a rather different
flavor, and we do not get near to the topic anywhere in this book. 

Cox processes are treated extensively in Grandell [429]. For mixed Poisson pro- 
cesses and Pólya processes, see e.g. the recent survey by Grandell [431] and references 
therein. 


2 Claim size distributions 


This section contains a brief survey of some of the most popular classes of
distributions $B$ which have been used to model the claims $U_1, U_2, \ldots$ We roughly
classify these into two groups, light-tailed distributions (sometimes the term
‘Cramér-type conditions’ is used), and heavy-tailed distributions. Here
light-tailed means that the tail $\overline{B}(x) = 1 - B(x)$ satisfies $\overline{B}(x) = O(e^{-sx})$ for some
$s > 0$. Equivalently, the m.g.f. $\hat{B}[s]$ is finite for some $s > 0$. In contrast, $B$ is
heavy-tailed if $\hat{B}[s] = \infty$ for all $s > 0$, but different more restrictive definitions
are often used: subexponential, regularly varying (see below) or even regularly
varying with infinite variance. On the more heuristic side, one could mention
also the folklore in actuarial practice to consider $B$ heavy-tailed if ‘20% of the
claims account for more than 80% of the total claims’, i.e. if

$$\frac{1}{\mu_B} \int_{b_{0.2}}^{\infty} x\, B(dx) > 0.8,$$

where $\overline{B}(b_{0.2}) = 0.2$ and $\mu_B$ is the mean of $B$.
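
To make the 20-80 criterion concrete, the sketch below evaluates the share of total claims carried by the largest 20% of claims in two cases where the integral has a closed form. The formulas are worked out here for the illustration and are not quoted from the text: for an exponential distribution the share is $q(1 - \log q)$ with $q = 0.2$, independent of the rate, and for the tail $\overline{B}(x) = x^{-\alpha}$, $x \ge 1$, with $\alpha > 1$ (an assumed Pareto form) it is $q^{(\alpha-1)/\alpha}$. The function names are hypothetical.

```python
import math


def top_share_exponential(q=0.2):
    """Share of total claims due to the largest fraction q of claims
    when claims are exponential (the rate cancels out)."""
    return q * (1.0 - math.log(q))


def top_share_pareto(alpha, q=0.2):
    """Same share for the tail x**(-alpha), x >= 1, which needs alpha > 1
    for the mean to be finite."""
    if alpha <= 1.0:
        raise ValueError("mean is infinite unless alpha > 1")
    return q ** ((alpha - 1.0) / alpha)


for alpha in (1.1, 1.5, 3.0):
    print("Pareto, alpha =", alpha, "share:", top_share_pareto(alpha))
print("exponential share:", top_share_exponential())
```

Under these assumptions the exponential distribution never meets the criterion, while the Pareto form meets it only for tail indices close to 1.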


2a Light-tailed distributions 


Example 2.1 (THE EXPONENTIAL DISTRIBUTION) Here the density is

$$b(x) = \delta e^{-\delta x}. \tag{2.1}$$

The parameter $\delta$ is referred to as the rate or the intensity, and can also be
interpreted as the (constant) failure rate $b(x)/\overline{B}(x)$.

As in a number of other applied probability areas, the exponential distribution
is by far the simplest to deal with in risk theory as well. In particular, for
the compound Poisson model with exponential claim sizes the ruin probability
$\psi(u)$ can be found in closed form. The crucial feature is the lack of memory: if
$U$ is exponential with rate $\delta$, then the conditional distribution of $U - x$ given
$U > x$ is again exponential with rate $\delta$ (this is essentially equivalent to the failure
rate being constant). For example in the compound Poisson model, a simple
stopping time argument shows that this implies that the conditional distribution
of the overshoot $S_{\tau(u)} - u$ at the time of ruin given $\tau(u)$ is again exponential
with rate $\delta$, a fact which turns out to contain considerable information.


Example 2.2 (THE GAMMA DISTRIBUTION) The gamma distribution with parameters
$p$, $\delta$ has density

$$b(x) = \frac{\delta^p}{\Gamma(p)}\, x^{p-1} e^{-\delta x} \tag{2.2}$$

and m.g.f.

$$\hat{B}[s] = \Big(\frac{\delta}{\delta - s}\Big)^p, \qquad s < \delta.$$

The mean $\mathbb{E}U$ is $p/\delta$ and the variance $\mathrm{Var}\,U$ is $p/\delta^2$. In particular, the squared
coefficient of variation (s.c.v.)

$$\frac{\mathrm{Var}\,U}{(\mathbb{E}U)^2} = \frac{1}{p}$$

is $< 1$ for $p > 1$, $> 1$ for $p < 1$ and $= 1$ for $p = 1$ (the exponential case).
The exact form of the tail $\overline{B}(x)$ is given by the incomplete Gamma function
$\Gamma(x; p)$,

$$\overline{B}(x) = \frac{\Gamma(\delta x; p)}{\Gamma(p)} \quad \text{where} \quad \Gamma(x; p) = \int_x^\infty t^{p-1} e^{-t}\, dt.$$

Asymptotically, one has

$$\overline{B}(x) \sim \frac{\delta^{p-1}}{\Gamma(p)}\, x^{p-1} e^{-\delta x}.$$

In the sense of the theory of infinitely divisible distributions, the Gamma
density (2.2) can be considered as the $p$th power of the exponential density
(2.1) (or the $1/p$th root if $p < 1$). In particular, if $p$ is integer and $U$ has the
gamma distribution $p$, $\delta$, then $U \stackrel{\mathcal{D}}{=} X_1 + \cdots + X_p$ where $X_1, X_2, \ldots$ are i.i.d. and
exponential with rate $\delta$. This special case is referred to as the Erlang distribution
with $p$ stages, or just the Erlang($p$) distribution. An appealing feature is its
simple connection to the Poisson process: $\overline{B}(x) = \mathbb{P}(X_1 + \cdots + X_p > x)$ is the
probability of at most $p - 1$ Poisson events in $[0,x]$ so that

$$\overline{B}(x) = \sum_{i=0}^{p-1} e^{-\delta x}\, \frac{(\delta x)^i}{i!}.$$

In the present text, we develop computationally tractable results mainly for
the Erlang case (i.e. $p \in \mathbb{N}$). Ruin probabilities for the general case have been
studied, among others, by Grandell & Segerdahl [433] and Thorin [847].
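
The Poisson representation of the Erlang tail is easy to check numerically. The sketch below, with hypothetical function names and an arbitrary truncation point for the integral, compares the finite Poisson sum with a midpoint-rule integration of the gamma density (2.2).

```python
import math


def erlang_tail(x, p, delta):
    """Tail of the Erlang(p) distribution with rate delta: the probability
    of at most p - 1 events of a Poisson process with rate delta in [0, x]."""
    return sum(
        math.exp(-delta * x) * (delta * x) ** i / math.factorial(i)
        for i in range(p)
    )


def gamma_density(x, p, delta):
    """Gamma density (2.2) with shape p and rate delta."""
    return delta ** p * x ** (p - 1) * math.exp(-delta * x) / math.gamma(p)


def tail_by_quadrature(x, p, delta, upper=60.0, steps=20000):
    """Midpoint-rule approximation of the integral of the density over
    [x, upper]; 'upper' is an assumed cutoff beyond which the mass is
    negligible for moderate p and delta."""
    h = (upper - x) / steps
    return h * sum(gamma_density(x + (k + 0.5) * h, p, delta) for k in range(steps))


x, p, delta = 1.5, 3, 2.0  # hypothetical values
print(erlang_tail(x, p, delta))
print(tail_by_quadrature(x, p, delta))
```

The two printed values should agree to several decimals; for $p = 1$ the sum reduces to the exponential tail $e^{-\delta x}$.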
Example 2.3 (THE HYPEREXPONENTIAL DISTRIBUTION) This is defined as a
finite mixture of exponential distributions,

$$b(x) = \sum_{i=1}^{p} \alpha_i \delta_i e^{-\delta_i x} \tag{2.3}$$

where $\sum_{i=1}^{p} \alpha_i = 1$, $0 \le \alpha_i \le 1$, $i = 1, \ldots, p$. An important property of the
hyperexponential distribution is that its s.c.v. is $\ge 1$.

If $\alpha_i \in \mathbb{R}$, then one speaks of the distribution as a combination of exponentials
and this class is dense in the set of all distributions on the positive halfline.
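
The s.c.v. property can be illustrated with a short computation. The sketch below uses the standard moments of an exponential mixture, $\mathbb{E}U = \sum_i \alpha_i/\delta_i$ and $\mathbb{E}U^2 = \sum_i 2\alpha_i/\delta_i^2$, and the form $\mathrm{Var}\,U/(\mathbb{E}U)^2$ of the s.c.v. used in Example 2.2; the function name and the sample parameters are hypothetical.

```python
def hyperexponential_scv(weights, rates):
    """Squared coefficient of variation Var U / (EU)^2 of a finite mixture
    of exponentials with mixing weights 'weights' and rates 'rates'."""
    mean = sum(a / d for a, d in zip(weights, rates))
    second_moment = sum(2.0 * a / d ** 2 for a, d in zip(weights, rates))
    return (second_moment - mean ** 2) / mean ** 2


# A single component is just the exponential case, with s.c.v. equal to 1.
print(hyperexponential_scv([1.0], [3.0]))
# Mixing two very different rates inflates the variability.
print(hyperexponential_scv([0.5, 0.5], [1.0, 10.0]))
```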


Example 2.4 (PHASE-TYPE DISTRIBUTIONS) A phase-type distribution is the
distribution of the absorption time in a Markov process with finitely many states,
of which one is absorbing and the rest transient. Important special cases are
the exponential, the Erlang and the hyperexponential distributions. This class
of distributions plays a major role in this book as the one within which
computationally tractable exact forms of the ruin probability $\psi(u)$ can be obtained.

The parameters of a phase-type distribution are the set $E$ of transient states,
the restriction $\mathbf{T}$ of the intensity matrix of the Markov process to $E$ and the
row vector $\boldsymbol{\alpha} = (\alpha_i)_{i \in E}$ of initial probabilities. The density and c.d.f. are

$$b(x) = \boldsymbol{\alpha} e^{\mathbf{T}x} \mathbf{t}, \quad \text{resp.} \quad B(x) = 1 - \boldsymbol{\alpha} e^{\mathbf{T}x} \mathbf{e}, \qquad x \ge 0,$$

where $\mathbf{t} = -\mathbf{T}\mathbf{e}$ and $\mathbf{e} = (1 \ldots 1)^{\mathsf{T}}$ is the column vector with 1 at all entries.
The couple $(\boldsymbol{\alpha}, \mathbf{T})$ or sometimes the triple $(E, \boldsymbol{\alpha}, \mathbf{T})$ is called the representation.
We give a more comprehensive treatment in IX.1 and defer further details to
Chapter IX.
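
A minimal sketch of how the matrix-exponential formulas can be evaluated is given below. To stay within the standard library it uses a simple truncated Taylor series with scaling and squaring for $e^{\mathbf{T}x}$; this is an illustrative choice, not a method prescribed by the text, and all function names are hypothetical. The check uses the Erlang(2) distribution with rate 2, represented (as an assumption of the example) by $\boldsymbol{\alpha} = (1, 0)$ and a two-phase matrix $\mathbf{T}$ in which phase 1 moves to phase 2 and phase 2 exits, each at rate 2.

```python
import math


def mat_mul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def mat_exp(m, terms=20, squarings=12):
    """exp(m) by a truncated Taylor series of m / 2**squarings, followed by
    repeated squaring. Adequate for small, well-scaled matrices only."""
    n = len(m)
    scale = 2.0 ** squarings
    scaled = [[v / scale for v in row] for row in m]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[v / k for v in row] for row in mat_mul(term, scaled)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def phase_type_tail(x, alpha, T):
    """Tail alpha * exp(T x) * e of a phase-type distribution."""
    e_tx = mat_exp([[v * x for v in row] for row in T])
    return sum(alpha[i] * sum(e_tx[i]) for i in range(len(alpha)))


def phase_type_density(x, alpha, T):
    """Density alpha * exp(T x) * t with exit rate vector t = -T e."""
    exit_rates = [-sum(row) for row in T]
    e_tx = mat_exp([[v * x for v in row] for row in T])
    n = len(alpha)
    return sum(
        alpha[i] * sum(e_tx[i][j] * exit_rates[j] for j in range(n))
        for i in range(n)
    )


# Erlang(2) with rate 2 as a phase-type distribution (assumed representation).
alpha = [1.0, 0.0]
T = [[-2.0, 2.0], [0.0, -2.0]]
x = 1.5
print(phase_type_tail(x, alpha, T), (1 + 2.0 * x) * math.exp(-2.0 * x))
print(phase_type_density(x, alpha, T), 4.0 * x * math.exp(-2.0 * x))
```

Each printed pair compares the matrix formula with the explicit Erlang(2) tail and density. In practice one would use a library routine for the matrix exponential rather than this hand-rolled version.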


Example 2.5 (DISTRIBUTIONS WITH RATIONAL TRANSFORMS) A distribution
$B$ has a rational m.g.f. (or, equivalently, a rational Laplace transform) if $\hat{B}[r] =
p(r)/q(r)$ with $p(r)$ and $q(r)$ being polynomials of finite degree. An equivalent
characterization is that the density $b(x)$ is the solution of a homogeneous
ordinary differential equation with constant coefficients

$$b^{(q)}(x) + d_{q-1} b^{(q-1)}(x) + \cdots + d_0 b(x) = 0, \qquad d_j \in \mathbb{R},\ d_0 \ne 0,$$

where one of the initial conditions is determined by $\int_0^\infty b(x)\, dx = 1$.
Consequently the density $b(x)$ has one of the forms

$$b(x) = \sum_{j=0}^{q} c_j x^j e^{\eta_j x}, \tag{2.4}$$

$$b(x) = \sum_{j=0}^{q_1} c_j x^j e^{\eta_j x} + \sum_{j=0}^{q_2} d_j x^j \cos(a_j x)\, e^{\delta_j x} + \sum_{j=0}^{q_3} e_j x^j \sin(b_j x)\, e^{\zeta_j x}, \tag{2.5}$$
where the parameters in (2.4) are possibly complex-valued but the parameters
in (2.5) are real-valued. This class of distributions is popular in the literature on 
both risk theory and queues, but often the attention is restricted to the class of 
phase-type distributions, which is slightly smaller but more amenable to proba- 
bilistic reasoning. We give some theory for matrix-exponential distributions in 
IX.6. 


Example 2.6 (DISTRIBUTIONS WITH BOUNDED SUPPORT) This example (i.e.
there exists an $x_0 < \infty$ such that $\overline{B}(x) = 0$ for $x \ge x_0$, $\overline{B}(x) > 0$ for $x < x_0$) is of
course a trivial instance of a light-tailed distribution. However, it is notable from
a practical point of view because of reinsurance: if excess-of-loss reinsurance has
been arranged with retention level $x_0$, then the claim size which is relevant from
the point of view of the insurance company itself is $U \wedge x_0$ rather than $U$ (the
excess $(U - x_0)^+$ is covered by the reinsurer). See XVI.6.


2b Heavy-tailed distributions 


Example 2.7 (THE WEIBULL DISTRIBUTION) This distribution originates from
reliability theory. Here failure rates $\delta(x) = b(x)/\overline{B}(x)$ play an important role,
the exponential distribution representing the simplest example since here $\delta(x)$ is
constant. However, in practice one may observe that $\delta(x)$ is either decreasing or
increasing and may try to model smooth (increasing or decreasing) deviations
from constancy by $\delta(x) = d x^{r-1}$ ($0 < r < \infty$). Writing $c = d/r$, we obtain the
Weibull distribution

$$\overline{B}(x) = e^{-c x^r}, \qquad b(x) = c r x^{r-1} e^{-c x^r}, \tag{2.6}$$

which is heavy-tailed when $0 < r < 1$. All moments are finite. Another
interpretation is that it is the distribution of $X^{1/r}$, where $X$ is exponential with
parameter $c$.
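
Both descriptions of the Weibull distribution can be checked with a few lines of Python. In the sketch below the function names, the parameter values $c = r = 0.5$ and the seed are hypothetical; the first check confirms that $b(x)/\overline{B}(x) = c r x^{r-1}$, and the second draws samples as $X^{1/r}$ with $X$ exponential and compares the empirical tail with $e^{-c x^r}$.

```python
import math
import random


def weibull_tail(x, c, r):
    """Tail exp(-c * x**r) of the Weibull distribution (2.6)."""
    return math.exp(-c * x ** r)


def weibull_density(x, c, r):
    """Density c * r * x**(r-1) * exp(-c * x**r), for x > 0."""
    return c * r * x ** (r - 1) * math.exp(-c * x ** r)


def weibull_failure_rate(x, c, r):
    """Failure rate b(x) / tail(x); should equal d * x**(r-1) with d = c*r."""
    return weibull_density(x, c, r) / weibull_tail(x, c, r)


def sample_weibull(c, r, rng):
    """Draw X**(1/r) where X is exponential with rate c."""
    return rng.expovariate(c) ** (1.0 / r)


c, r = 0.5, 0.5  # hypothetical parameters, heavy-tailed case 0 < r < 1
print(weibull_failure_rate(2.0, c, r), c * r * 2.0 ** (r - 1))

rng = random.Random(7)
n, level = 100000, 4.0
empirical = sum(sample_weibull(c, r, rng) > level for _ in range(n)) / n
print(empirical, weibull_tail(level, c, r))
```

For $0 < r < 1$ the failure rate is decreasing in $x$, which is the qualitative signature of the heavy-tailed case.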


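Example 2.8 (THE LOGNORMAL DISTRIBUTION) The lognormal distribution
with parameters $\sigma^2$, $\mu$ is defined as the distribution of $e^V$ where $V \sim N(\mu, \sigma^2)$,
or equivalently as the distribution of $e^{\sigma W + \mu}$ where $W \sim N(0,1)$. It follows that
the density is

$$b(x) = \frac{d}{dx}\, \Phi\Big(\frac{\log x - \mu}{\sigma}\Big) = \frac{1}{\sigma x}\, \varphi\Big(\frac{\log x - \mu}{\sigma}\Big)
= \frac{1}{x \sqrt{2\pi\sigma^2}} \exp\Big\{ -\frac{1}{2} \Big(\frac{\log x - \mu}{\sigma}\Big)^2 \Big\}. \tag{2.7}$$

Asymptotically, the tail is

$$\overline{B}(x) = \overline{\Phi}\Big(\frac{\log x - \mu}{\sigma}\Big) \sim \frac{\sigma}{\log x\, \sqrt{2\pi}} \exp\Big\{ -\frac{1}{2} \Big(\frac{\log x - \mu}{\sigma}\Big)^2 \Big\}, \tag{2.8}$$

which is heavier than the one of the Weibull distribution. The lognormal
distribution has moments of all orders. In particular, the mean is $e^{\mu + \sigma^2/2}$ and the
second moment is $e^{2\mu + 2\sigma^2}$.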


Example 2.9 (THE PARETO DISTRIBUTION) Here the essence is that the tail
$\overline{B}(x)$ decreases like a power of $x$. There are various variants of the definition
around, the simplest one being

$$\overline{B}(x) = x^{-\alpha}, \qquad x \ge 1, \tag{2.9}$$

which can be interpreted as the distribution of $e^X$ for an exponential r.v. $X$
with parameter $\alpha$. Another variant is often referred to as US-Pareto and defined
by

$$\overline{B}(x) = \frac{a^\alpha}{(a + x)^\alpha}, \qquad b(x) = \frac{\alpha a^\alpha}{(a + x)^{\alpha + 1}}, \qquad x > 0, \tag{2.10}$$

for some $a > 0$. The $p$th moment is finite if and only if $p < \alpha$.

The Laplace-Stieltjes transform of the Pareto distribution defined in (2.9)
can be expressed through the incomplete Gamma function by

$$\hat{B}[-s] = \int_1^\infty e^{-sx}\, \frac{\alpha}{x^{\alpha+1}}\, dx = \alpha s^\alpha\, \Gamma(-\alpha, s).$$

Similarly, the Laplace-Stieltjes transform of the US-Pareto distribution is $\hat{B}[-s] =
\alpha (as)^\alpha e^{as}\, \Gamma(-\alpha, as)$. These relatively simple expressions have not always been
noted.

Abate, Choudhury & Whitt [1] introduced a somewhat related class of random
variables called Pareto mixture of exponentials, which are products of
Pareto and exponential r.v.’s and lead to quite explicit Laplace-Stieltjes
transforms.


Example 2.10 (THE LOGGAMMA DISTRIBUTION) The loggamma distribution
with parameters $p$, $\delta$ is defined as the distribution of $e^V$ where $V$ has the gamma
density (2.2). The density is

$$b(x) = \frac{\delta^p (\log x)^{p-1}}{x^{\delta + 1}\, \Gamma(p)}. \tag{2.11}$$

The $p$th moment is finite if $p < \delta$ and infinite if $p \ge \delta$. For $p = 1$, the loggamma
distribution is a Pareto distribution.


Example 2.11 (DISTRIBUTIONS WITH REGULARLY VARYING TAILS) The tail
$\overline{B}(x)$ of a distribution $B$ is said to be regularly varying with index $\alpha$ if

$$\overline{B}(x) \sim \frac{L(x)}{x^\alpha}, \quad x \to \infty, \tag{2.12}$$

where $L(x)$ is slowly varying, i.e. satisfies $L(xt)/L(x) \to 1$ as $x \to \infty$ (any $L$
having a limit in $(0,\infty)$ is slowly varying; another standard example is $(\log x)^\alpha$).
Thus, examples of distributions with regularly varying tails are the Pareto
distribution (2.10) (here $L(x) \to a^\alpha$), the loggamma distribution (with index $\delta$) and
a Pareto mixture of exponentials.


Example 2.12 (THE SUBEXPONENTIAL CLASS OF DISTRIBUTIONS) We say that 
a distribution B is subexponential if 


$$\lim_{x\to\infty} \frac{\overline{B^{*2}}(x)}{\overline{B}(x)} = 2. \tag{2.13}$$


It can be proved (see X.1) that any distribution with a regularly varying tail is 
subexponential. Also, for example the lognormal distribution is subexponential 
(but not regularly varying), though the proof of this is non-trivial, and so is 
the Weibull distribution with $0 < r < 1$. Thus, the subexponential class of
distributions provides a convenient framework for studying large classes of
heavy-tailed distributions. We return to a closer study in X.1.


When studying ruin probabilities, it will be seen that we obtain completely 
different results depending on whether the claim size distribution is exponen- 
tially bounded or heavy-tailed. From a practical point of view, this phenomenon 
represents one of the true controversies of the area. Namely, the knowledge of 
the claim size distribution will typically be based upon statistical data, and 
based upon such information it seems questionable to extrapolate to tail be- 
havior. However, one may argue that this difficulty is not restricted to ruin 
probability theory alone. Similar discussion applies to the distribution of the 
accumulated claims (XVI.2) or even to completely different applied probability 
areas like extreme value theory: if we are using a Gaussian process to predict 
extreme value behavior, we may know that such a process (with a covariance 
function estimated from data) is a reasonable description of the behavior of the 
system under study in typical conditions, but can never be sure whether this 
is also so for atypical levels for which far less detailed statistical information is 
available. We give some discussion on standard methods to distinguish between 
light and heavy tails in Chapter X. 


3 The arrival process 


For the purpose of modeling a risk process, the claim size distribution represents 
of course only one aspect (though a major one). At least as important is the 
specification of the structure of the point process $\{N_t\}$ of claim arrivals and its
possible dependence with the claims.

By far the most prominent case is the compound Poisson (Cramér-Lundberg) 
model where $\{N_t\}$ is Poisson and independent of the claim sizes $U_1, U_2, \ldots$ The
reason is in part mathematical since this model is the easiest to analyze, but 
the model also admits a natural interpretation: a large portfolio of insurance 
holders, which each have a (time-homogeneous) small rate of experiencing a 
claim, gives rise to an arrival process which is very close to a Poisson process, 
in just the same way as the Poisson process arises in telephone traffic (a large 
number of subscribers each calling with a small rate), radioactive decay (a huge 
number of atoms each splitting with a tiny rate) and many other applications. 
The compound Poisson model is studied in detail in Chapters IV, V (and, with 
the extension to premiums depending on the reserve, in Chapter VIII). 

To the authors’ knowledge, not so many detailed studies of the goodness-of-fit 
of the Poisson model in insurance are available. Some of them have concentrated 
on the marginal distribution of $N_T$ (say $T$ = one year), found the Poisson
distribution to be inadequate and suggested various other univariate distributions
as alternatives, e.g. the negative binomial distribution. The difficulty in such
an approach lies in that it may be difficult or even impossible to embed such a
distribution into the continuous set-up of $\{N_t\}$ evolving over time, and also that
the ruin problem may be hard to analyze. Nevertheless, getting away from the 
simple Poisson process seems a crucial step in making the model more realistic, 
in particular to allow for certain inhomogeneities. 

Historically, the first extension to be studied in detail was to take $\{N_t\}$ to be a renewal
process (the interarrival times $T_1, T_2, \ldots$ are i.i.d. but with a general, not necessarily
exponential, distribution). This model, to be studied in Chapter VI, has some
mathematically appealing random walk features, which facilitate the analysis. 
However, it is more questionable whether it provides a model with a similar 
intuitive content as the Poisson model. One could possibly argue that renewal 
models are a compromise between choosing a tractable model and taking into 
account statistical information that may indicate that exponential interarrival 
time distributions do not calibrate given data well enough. Of course, one is then 
still left to believe in the independence assumption and — with the introduced 
memory between claims — one has to be aware that the resulting model is for 
most applications to be seen as an interpolation rather than a causal model. 

A more appealing way to allow for inhomogeneity is by means of an intensity
$\beta(t)$ fluctuating over time. An obvious example is $\beta(t)$ depending on the time
of the year (the season), so that $\beta(t)$ is a periodic function of $t$; we study this
case in VII.6. Another one is Cox processes, where $\{\beta(t)\}_{t\ge 0}$ is an arbitrary
stochastic process. In order to prove reasonably substantial and interesting
results, Cox processes are, however, too general and one needs to specialize to
more concrete assumptions. The one we focus on (Chapter VII) is a Markovian
environment: the environmental conditions are described by a finite Markov
process $\{J_t\}_{t\ge 0}$, such that $\beta(t) = \beta_i$ when $J_t = i$. I.e., with a common term,
$\{N_t\}$ is a Markov-modulated Poisson process; its basic feature is to allow more
variation (bursty arrivals) than inherent in the simple Poisson process. This
model can be intuitively understood in some simple cases like $\{J_t\}$ describing
weather conditions in car insurance, epidemics in life insurance etc. In others, it 
may be used in a purely descriptive way when it is empirically observed that the 
claim arrivals are more bursty than allowed for by the simple Poisson process. 

Mathematically, the periodic and the Markov-modulated models also have 
attractive features. The point of view we take here is Markov-dependent random 
walks in continuous time (Markov additive processes), see II.4. This applies 
also to the case where the claim size distribution depends on the time of the year 
or the environment (VII.6), and which seems well motivated from a practical 
point of view as well. 


4 A summary of main results and methods 


4a Duality with other applied probability models 


Risk theory may be viewed as one of many applied probability areas, others being 
branching processes, genetics models, queueing theory, dam/storage processes, 
reliability, interacting particle systems, stochastic differential equations, time 
series and Gaussian processes, extreme value theory, stochastic geometry, point 
processes and so on. Some of these have a certain resemblance in flavor and 
methodology, others are quite different. 

The ones which appear most related to risk theory are queueing theory and 
dam/storage processes. In fact, it is a recurrent theme of this book to stress 
this connection which was often neglected in the early specialized literature on 
risk theory. Mathematically, the classical result is that the ruin probabilities for 
the compound Poisson model are related to the workload (virtual waiting time) 
process $\{V_t\}_{t\ge 0}$ of an initially empty M/G/1 queue by means of

$$\psi(u,T) = \mathbb{P}(V_T > u), \qquad \psi(u) = \mathbb{P}(V > u), \tag{4.1}$$

where $V$ is the limit in distribution of $V_t$ as $t \to \infty$. The M/G/1 workload
process $\{V_t\}$ may also be seen as one of the simplest storage models, with Poisson
arrivals and constant release rule $p(x) = 1$. A general release rule $p(x)$ means
that $\{V_t\}$ decreases according to the differential equation $\dot{V} = -p(V)$ in between
jumps, and here (4.1) holds as well provided the risk process has a premium
rule depending on the reserve, $\dot{R} = p(R)$ in between jumps. Similarly, ruin
probabilities for risk processes with an input process which is renewal,
Markov-modulated or periodic can be related to queues with similar characteristics.
Thus, it is desirable to have a set of formulas like (4.1) that permit translating
freely between risk theory and the queueing/storage setting. In Chapter VIII
we will also see a direct and natural link between the maximum workload of 
an M/G/1 queue and the ruin probability in a compound Poisson risk model 
in terms of excursions. In general, methods or modeling ideas developed in one 
area often have relevance for the other one as well. 

A stochastic process $\{V_t\}$ is said to be in the steady state if it is strictly
stationary (in the Markov case, this amounts to $V_0$ having the stationary
distribution of $\{V_t\}$), and the limit $t \to \infty$ is the steady-state limit. The study of the
steady state is by far the most dominant topic of queueing and storage theory,
and a lot of information on steady-state r.v.’s like $V$ is available. It should be
noted, however, that quite often the emphasis is on computing expected values
like $\mathbb{E}V$. In the setting of (4.1), this gives only $\int_0^\infty \psi(u)\,du$ which is of limited
intrinsic interest. Thus, the two areas, though overlapping, have to some extent
a different flavor.

A prototype of the duality results in this book is Theorem III.2.1, which gives
a sample path version of (4.1) in the setting of a general premium rule $p(x)$: the
events $\{V_T > u\}$ and $\{\tau(u) \le T\}$ coincide when the risk process and the storage
process are coupled in a suitable way (via time-reversion). The infinite horizon
(steady state) case is covered by letting $T \to \infty$. The fact that Theorem III.2.1
is a sample path relation should be stressed: in this way the approach also
applies to models having supplementary r.v.’s like the environmental process
$\{J_t\}$ in a Markov-modulated setting.


4b Exact solutions 


Of course, the ideal is to be able to come up with closed form solutions for the
ruin probabilities $\psi(u)$, $\psi(u,T)$. The cases where this is possible are basically
the following for the infinite horizon ruin probability $\psi(u)$:


• The compound Poisson model with constant premium rate $p = 1$ and
exponential claim size distribution $B$, $\overline{B}(x) = e^{-\delta x}$. Here $\psi(u) = \rho e^{-\gamma u}$
where $\beta$ is the arrival intensity, $\rho = \beta/\delta$ and $\gamma = \delta - \beta$.


• The compound Poisson model with constant premium rate $p = 1$ and $B$
being phase-type with just a few phases. Here $\psi(u)$ is given in terms of
a matrix-exponential function (Corollary IX.3.1), which can be expanded
into a sum of exponential terms by diagonalization (see, e.g., Example
IX.3.2). The qualifier ‘with just a few phases’ refers to the fact that the
diagonalization has to be carried out numerically in higher dimensions.


• The compound Poisson model with a claim size distribution degenerate at
one point, see Corollary IV.3.7.


• The compound Poisson model with some rather special heavy-tailed claim
size distributions, see Boxma & Cohen [193] and Abate & Whitt [3].


• The compound Poisson model with premium rate $p(x)$ depending on the
reserve and exponential claim size distribution $B$. Here $\psi(u)$ is explicit
provided that, as is typically the case, the functions


$$\omega(x) = \int_0^x \frac{1}{p(y)}\,dy, \qquad \int_0^x \frac{1}{p(y)}\,e^{\beta\omega(y)-\delta y}\,dy$$


can be written in closed form, see Corollary VIII.1.9. 


• The compound Poisson model with a piecewise constant premium rule
$p(x)$ and $B$ being phase-type with just a few phases, see IX.7.


• Renewal models with exponential claim sizes, see Theorem VI.2.2.


• Renewal model variants of the above cases for which the interclaim time
is phase-type with just a few phases.


• Any Lévy model where the risk reserve process (not the claim surplus
process!) is downward skipfree (Theorem XI.2.3). This includes Brownian
motion.


• Any Lévy model for which the scale function is explicitly available, see
XI.3 (for an early example cf. Furrer [381]).


A further notable fact (see again XVI.1) is the explicit form of the ruin
probability when $\{R_t\}$ is a diffusion with infinitesimal drift and variance $\mu(x)$, $\sigma^2(x)$:

$$\psi(u) = \frac{\int_u^\infty \exp\big\{-\int_0^x 2\mu(y)/\sigma^2(y)\,dy\big\}\,dx}{\int_0^\infty \exp\big\{-\int_0^x 2\mu(y)/\sigma^2(y)\,dy\big\}\,dx} = 1 - \frac{S(u)}{S(\infty)}, \tag{4.2}$$

where

$$S(u) = \int_0^u \exp\Big\{-\int_0^x 2\mu(y)/\sigma^2(y)\,dy\Big\}\,dx$$

is the natural scale.

The finite horizon ruin probability $\psi(u,T)$ is explicit for Brownian motion
(III.1) and the compound Poisson model with exponential claim size
distribution (V.1). Later in the book a number of further rather specific cases will be
discussed for which explicit expressions exist.
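
Returning to the ruin probability of a diffusion expressed through its natural scale $S$: the following Python sketch evaluates $\psi(u) = \int_u^\infty s(x)\,dx \big/ \int_0^\infty s(x)\,dx$ with $s(x) = \exp\{-\int_0^x 2\mu(y)/\sigma^2(y)\,dy\}$ by the trapezoidal rule. The function name, the truncation point `upper` and the grid size are assumptions of this sketch. It presupposes that $S(\infty) < \infty$ (for instance a drift bounded away from zero from below) and that `upper` is large enough for the neglected tail to be negligible.

```python
import math


def ruin_prob_diffusion(u, drift, variance, upper, steps=20000):
    """Approximate psi(u) = int_u^upper s(x) dx / int_0^upper s(x) dx.

    drift(x) and variance(x) are the infinitesimal drift and variance;
    u is rounded to the nearest grid point of the mesh upper/steps.
    """
    h = upper / steps
    first = int(round(u / h))          # index of the grid point closest to u
    inner = 0.0                        # running value of int_0^x 2*mu/sigma^2 dy
    prev_rate = 2.0 * drift(0.0) / variance(0.0)
    prev_s = 1.0                       # s(0) = 1
    total = 0.0
    tail = 0.0
    for i in range(1, steps + 1):
        x = i * h
        rate = 2.0 * drift(x) / variance(x)
        inner += 0.5 * h * (prev_rate + rate)
        s = math.exp(-inner)
        piece = 0.5 * h * (prev_s + s)  # trapezoid on [x - h, x]
        total += piece
        if i > first:
            tail += piece
        prev_rate, prev_s = rate, s
    return tail / total


if __name__ == "__main__":
    # Constant coefficients: the exact value is exp(-2*mu*u/sigma^2).
    mu, sigma2, u = 0.5, 1.0, 1.0
    approx = ruin_prob_diffusion(u, lambda x: mu, lambda x: sigma2, upper=40.0)
    print(approx, math.exp(-2.0 * mu * u / sigma2))
```

For constant coefficients the quadrature can be checked against the closed form $e^{-2\mu u/\sigma^2}$; for state-dependent coefficients the same routine applies, provided the truncation is chosen with care.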
4c Numerical methods


Next to a closed-form solution, the second best alternative is a numerical
procedure that allows one to calculate the exact values of the ruin probabilities. Here
are some of the main approaches:


Laplace transform inversion Often, it is easier to find the Laplace
transforms

$$\hat{\psi}[-s] = \int_0^\infty e^{-su}\psi(u)\,du, \qquad \hat{\psi}[-s,-\omega] = \int_0^\infty\!\!\int_0^\infty e^{-su-\omega T}\psi(u,T)\,du\,dT$$

in closed form rather than the ruin probabilities $\psi(u)$, $\psi(u,T)$ themselves.
In that case $\psi(u)$, $\psi(u,T)$ can be calculated numerically by some method
for transform inversion, say the fast Fourier transform (FFT) as
implemented in Grübel [438] for infinite horizon ruin probabilities for the
renewal model. We do not discuss Laplace transform inversion much; other
relevant references are Abate & Whitt [2], Embrechts, Grübel & Pitts [346]
and Grübel & Hermesmeier [439]; see also Albrecher, Avram & Kortschak
[14] and the Bibliographical Notes in [746, p. 191].


Matrix-analytic methods This approach is relevant when the claim size
distribution is of phase-type (or matrix-exponential), and in quite a few cases
(Chapter IX), $\psi(u)$ is then given in terms of a matrix-exponential
function $e^{Uu}$ (here $U$ is some suitable matrix) which can be computed by
diagonalization, as the solution of linear differential equations or by some
series expansion (not necessarily the straightforward $\sum_{n=0}^\infty U^n u^n/n!$ one!).
In the compound Poisson model with $p = 1$, $U$ is explicit in terms of the
model parameters, whereas for the renewal arrival model and the
Markovian environment model $U$ has to be calculated numerically, either as the
iterative solution of a fixed point problem or by finding the diagonal form
in terms of the complex roots to certain transcendental equations.


Differential- and integral equations The idea here is to express $\psi(u)$ or
$\psi(u,T)$ as the solution to a differential or integral equation, and carry
out the solution by some standard numerical method. One example where
this is feasible is the renewal equation for $\psi(u)$ (Corollary IV.3.3) in the
compound Poisson model which is an integral equation of Volterra type.
The method is, however, restricted to models that have a certain degree
of Markovian structure in which case conditioning (or applying the more
formal tool of generators, see II.4a) leads to equations that often involve
both differential and integral terms. We will discuss cases where this
approach can even lead to explicit solutions (see e.g. IX.7 and XII.3c). In
many more cases, numerical solution methods are applicable, although the
initial or boundary conditions can be a challenge.


If an integral equation is available, it is often possible to define a
contractive integral operator and to identify the ruin probability as its fixed point,
in which case the ruin probability can be approximated by iterated
application of the integral operator to some starting function. The resulting
high-dimensional integral can then be calculated by standard Monte Carlo
and Quasi-Monte Carlo techniques (see e.g. Albrecher et al. [31, 24]).
Compared with direct simulation of the risk process (as discussed in
Section 4g), this technique often has significant computational advantages.
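
To illustrate the integral-equation approach, here is a minimal Python sketch of a trapezoidal scheme for the renewal equation of the compound Poisson model with premium rate 1. The equation is not displayed in this section; the form assumed below, $\psi(u) = \beta\int_u^\infty \overline{B}(x)\,dx + \beta\int_0^u \psi(u-x)\overline{B}(x)\,dx$, is the standard one and should be checked against Corollary IV.3.3. All function and parameter names are illustrative.

```python
import math


def solve_renewal_equation(beta, tail, integrated_tail, u_max, steps):
    """Solve psi(u) = beta*I(u) + beta * int_0^u psi(u-x) * tail(x) dx on [0, u_max].

    tail(x) is the claim size tail; integrated_tail(u) = int_u^inf tail(x) dx.
    Returns the mesh h and the list psi[k] ~ psi(k*h), using the trapezoidal rule.
    """
    h = u_max / steps
    b = [tail(j * h) for j in range(steps + 1)]
    psi = [beta * integrated_tail(0.0)]          # psi(0) = beta * mean claim = rho
    for k in range(1, steps + 1):
        g = beta * integrated_tail(k * h)
        # Trapezoid nodes x = j*h, j = 0..k; the j = 0 node contains the unknown psi[k].
        conv = 0.5 * psi[0] * b[k] + sum(psi[k - j] * b[j] for j in range(1, k))
        psi.append((g + beta * h * conv) / (1.0 - 0.5 * beta * h * b[0]))
    return h, psi


if __name__ == "__main__":
    beta, delta = 0.5, 1.0                      # exponential claims with rate delta
    h, psi = solve_renewal_equation(
        beta,
        tail=lambda x: math.exp(-delta * x),
        integrated_tail=lambda u: math.exp(-delta * u) / delta,
        u_max=4.0,
        steps=400,
    )
    exact = (beta / delta) * math.exp(-(delta - beta) * 4.0)
    print(psi[-1], exact)
```

With exponential claims the output can be compared with the closed form $\rho e^{-\gamma u}$, which gives a simple sanity check before the scheme is applied to claim size distributions without an explicit solution.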


4d Approximations


The Cramér-Lundberg approximation This is one of the most celebrated
results of risk theory (and probability theory as a whole). For the
compound Poisson model with $p = 1$ and claim size distribution $B$ with
moment generating function (m.g.f.) $\hat{B}[s]$, it states that

$$\psi(u) \sim C e^{-\gamma u}, \quad u \to \infty, \tag{4.3}$$

where $C = (1-\rho)/(\beta\hat{B}'[\gamma] - 1)$ and $\gamma > 0$ is the solution of the Lundberg
equation

$$\beta(\hat{B}[\gamma] - 1) - \gamma = 0, \tag{4.4}$$

which can equivalently be written as

$$\hat{B}[\gamma] = 1 + \frac{\gamma}{\beta}. \tag{4.5}$$


It is rather standard to call $\gamma$ the adjustment coefficient but a variety of
other terms are also frequently encountered (and often the notation $R$
instead of $\gamma$ is used in the literature). The Cramér-Lundberg approximation
is renowned not only for its mathematical beauty but also for being very
precise, often for all $u \ge 0$ and not just for large $u$. It has generalizations
to other Lévy models, to the models with renewal arrivals, a Markovian
environment or periodically varying parameters. However, in such cases
the evaluation of $C$ is more cumbersome. In fact, when the claim size
distribution is of phase-type, the exact solution is as easy to compute as
the Cramér-Lundberg approximation in some of these models.


The shape of the l.h.s. of equation (4.4) and its extensions to other models
lie at the heart of ruin theory. Its level sets (not only the one at 0) reveal a
lot of (in particular asymptotic) properties of ruin-related quantities and
will play an important role in this book.


Diffusion approximations Here the idea is simply to approximate the risk 
process by a Brownian motion (or a more general diffusion) by fitting the 
first and second moments, and to use the fact that first passage proba- 
bilities are more readily calculated for diffusions than for the risk process 
itself. Diffusion approximations are easy to calculate, but typically not 
very precise in their first naive implementation. However, incorporating 
correction terms may change the picture dramatically. In particular,
corrected diffusion approximations (see V.6) are by far the best one can do
in terms of finite horizon ruin probabilities $\psi(u,T)$.


Large claims approximations In order for the Cramér-Lundberg
approximation to be valid, the claim size distribution should have an exponentially
decreasing tail $\overline{B}(x)$. In the case of heavy-tailed distributions, other
approaches are thus required. Approximations for $\psi(u)$ as well as for $\psi(u,T)$
for large $u$ are available in most of the models we discuss. For example,
for the compound Poisson model under certain assumptions on $B$

$$\psi(u) \sim \frac{\rho}{\mu_B(1-\rho)}\int_u^\infty \overline{B}(x)\,dx, \quad u \to \infty, \tag{4.6}$$

where $\mu_B$ denotes the mean of $B$. In fact, in some cases the results are even
more complete than for light tails. See Chapter X.


This list of approximations does by no means exhaust the topic; some further 
possibilities are surveyed in IV.7 and V.2. 
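
As an illustration of how the Cramér-Lundberg approximation might be evaluated in practice, the following Python sketch solves the Lundberg equation $\beta(\hat{B}[\gamma]-1)-\gamma = 0$ by bisection and then computes $C = (1-\rho)/(\beta\hat{B}'[\gamma]-1)$. The function names and the bracketing argument `upper` are assumptions of this sketch; the caller must supply the m.g.f. and its derivative and an `upper` below the point where the m.g.f. becomes infinite.

```python
import math


def adjustment_coefficient(beta, mgf, upper, tol=1e-12):
    """Positive root of f(g) = beta*(mgf(g) - 1) - g on (0, upper), by bisection.

    f is convex with f(0) = 0 and f'(0) = rho - 1 < 0 under the net profit
    condition, so f < 0 on (0, gamma) and f > 0 on (gamma, upper].
    """
    def f(g):
        return beta * (mgf(g) - 1.0) - g

    if f(upper) <= 0.0:
        raise ValueError("upper does not bracket the root; increase it")
    lo, hi = 0.0, upper
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def cramer_lundberg_constant(beta, rho, mgf_derivative, gamma):
    """C = (1 - rho) / (beta * B'[gamma] - 1)."""
    return (1.0 - rho) / (beta * mgf_derivative(gamma) - 1.0)


if __name__ == "__main__":
    beta, delta = 0.5, 1.0                       # exponential claims, rate delta
    mgf = lambda s: delta / (delta - s)          # finite for s < delta
    dmgf = lambda s: delta / (delta - s) ** 2
    gamma = adjustment_coefficient(beta, mgf, upper=0.999 * delta)
    c = cramer_lundberg_constant(beta, beta / delta, dmgf, gamma)
    print(gamma, c)   # for exponential claims: gamma = delta - beta, C = rho
```

For exponential claims the approximation is exact, so the computed pair $(\gamma, C)$ should reproduce $(\delta-\beta, \rho)$; for other light-tailed claim size distributions only the two callables need to be replaced.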


4e Bounds and inequalities 


The outstanding result in the area is Lundberg’s inequality

$$\psi(u) \le e^{-\gamma u}. \tag{4.7}$$


Compared to the Cramér-Lundberg approximation (4.3), it has the advantage 
of not involving approximations and also, as a general rule, of being somewhat 
easier to generalize beyond the compound Poisson setting. We return to various 
extensions and sharpenings of Lundberg’s inequality (finite horizon versions, 
lower bounds etc.) at various places and in various settings. 

When comparing different risk models, it is a general principle that adding 
random variation to a model increases the risk. For example, one expects a 
model with a deterministic claim size distribution B, say degenerate at m, to 
have smaller ruin probabilities than when B is non-degenerate with the same
mean m. This is proved for the compound Poisson model in IV.8 (see also further 
ordering results for dependent risks in Section XIII.8). Empirical evidence shows 
that the general principle holds in a broad variety of settings, though precise 
mathematical results are not always available. 


4f Statistical methods 


All of the approaches and results above assume that the parameters of the
model are completely known. In practice, however, they have to be estimated
from data, obtained say by observing the risk process in $[0,T]$. This procedure
in itself is fairly straightforward; e.g., in the compound Poisson model, it splits
up into the estimation of the Poisson intensity (the estimator is $\hat{\beta} = N_T/T$) and
of the parameter(s) of the claim size distribution, which is a standard statistical
problem since the claim sizes $U_1, \ldots, U_{N_T}$ are i.i.d. given $N_T$. However, the
difficulty comes in when drawing inference about the ruin probabilities. How
do we produce a confidence interval? And, more importantly, can we trust the
confidence intervals for the large values of $u$ which are of interest? In the present
authors’ opinion, this is extrapolation from data due to the extreme sensitivity
of the ruin probabilities to the tail of the claim size distribution in particular
(in contrast, fitting a parametric model to $U_1, \ldots, U_{N_T}$ may be viewed as an
interpolation or smoothing of the histogram). For example, one may question
whether it is possible to distinguish between claim size distributions which are 
heavy-tailed and those that have an exponentially decaying tail. This issue will 
be further discussed in Section X.6. 


4g Simulation 


The development of modern computers has made simulation a popular experi- 
mental tool in all branches of applied probability and statistics, and of course 
the method is relevant in risk theory as well. Simulation may be used just to get 
some vague insight in the process under study: simulate one or several sample 
paths, and look at them to see whether they exhibit the expected behavior or 
some surprises come up. However, the more typical situation is to perform a 
Monte Carlo experiment to estimate probabilities (or expectations or distribu- 
tions) which are not analytically available. For example, this is a straightforward 
way to estimate finite horizon ruin probabilities. 

The infinite horizon case presents a difficulty, because it appears to require 
an infinitely long simulation. Truncation to a finite horizon (or above a certain 
surplus level) has been used, but is not very satisfying. Still, good methods exist 
in a number of models and are based upon representing the ruin probability $\psi(u)$
as the expected value of a r.v. (or a functional of the expectation of a set of r.v.’s)
which can be generated by simulation. The problem is entirely analogous to 
estimating steady-state characteristics by simulation in queueing/storage theory, 
and in fact methods from that area can often be used in risk theory as well. We 
look at a variety of such methods in Chapter XV, and also discuss how to develop 
methods which are efficient in terms of producing a small variance for a fixed 
simulation budget. A main problem is that ruin is typically a rare event (i.e. 
having small probability) and that therefore it is expensive or even infeasible 
in terms of computer time to obtain reasonably precise estimates of the ruin 
probability through naive simulation. Variance reduction techniques to improve 
the situation are discussed in Chapter XV. 
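
As a concrete illustration of the naive Monte Carlo approach for the finite horizon case, here is a minimal Python sketch for the compound Poisson model with premium rate 1 and exponential claims. It is a crude estimator without any variance reduction; the function names, parameter values and seed are assumptions of this sketch.

```python
import math
import random


def ruined_before(u, beta, delta, horizon, rng):
    """Simulate one path of R_t = u + t - sum of claims; True if ruin occurs by `horizon`.

    Between claims the reserve only increases, so ruin can only occur at claim epochs.
    """
    t = 0.0
    reserve = u
    while True:
        wait = rng.expovariate(beta)        # interarrival time of the Poisson process
        if t + wait > horizon:
            return False
        t += wait
        reserve += wait                      # premiums earned since the last claim
        reserve -= rng.expovariate(delta)    # exponential claim with rate delta
        if reserve < 0.0:
            return True


def estimate_ruin_probability(u, beta, delta, horizon, n_paths, seed=0):
    """Crude Monte Carlo estimate of psi(u, T) with a 95% normal half-width."""
    rng = random.Random(seed)
    hits = sum(ruined_before(u, beta, delta, horizon, rng) for _ in range(n_paths))
    p = hits / n_paths
    half_width = 1.96 * math.sqrt(p * (1.0 - p) / n_paths)
    return p, half_width


if __name__ == "__main__":
    p, hw = estimate_ruin_probability(u=2.0, beta=0.5, delta=1.0,
                                      horizon=50.0, n_paths=4000, seed=1)
    bound = 0.5 * math.exp(-0.5 * 2.0)       # infinite horizon value rho*exp(-gamma*u)
    print(f"estimate {p:.4f} +/- {hw:.4f}; infinite-horizon value {bound:.4f}")
```

The relative width of the confidence interval grows quickly as $u$ increases and ruin becomes rarer, which is the rare-event difficulty mentioned above.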


Chapter II 


Martingales and simple ruin 
calculations 


1 Wald martingales 


A random walk in discrete time is defined as $R_n = R_0 + Y_1 + \cdots + Y_n$ where the
$Y_i$ are i.i.d., with common distribution $F$ (say). Here $F$ is a general probability
distribution on $\mathbb{R}$ (the special case of $F$ being concentrated on $\{-1,1\}$ is often
referred to as simple random walk or Bernoulli random walk). Most often in
the probability literature, $R_0 = 0$, but since we are here thinking of the random
walk as a model for the risk reserve, we often allow $R_0 = u > 0$. Denote by
$\hat{F}[\alpha] = \mathbb{E}e^{\alpha Y_1}$ the m.g.f. (moment generating function) of $F$ and let $\kappa(\alpha) =
\log \hat{F}[\alpha]$ be the c.g.f. (cumulant generating function).


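Theorem 1.1 Let $R_n = R_0 + Y_1 + \cdots + Y_n$ be a random walk. Then for any $\alpha$
such that $\hat{F}[\alpha] < \infty$, the sequence

$$\frac{e^{\alpha(Y_1+\cdots+Y_n)}}{\hat{F}[\alpha]^n} = e^{\alpha(Y_1+\cdots+Y_n) - n\kappa(\alpha)} \tag{1.1}$$

is a martingale.


Proof. Denote by $M_n$ the expression (1.1). Then

$$\mathbb{E}[M_{n+1} \mid Y_1,\ldots,Y_n] = M_n\,\mathbb{E}\Big[\frac{e^{\alpha Y_{n+1}}}{\hat{F}[\alpha]} \,\Big|\, Y_1,\ldots,Y_n\Big] = M_n\,\frac{\mathbb{E}e^{\alpha Y_{n+1}}}{\hat{F}[\alpha]} = M_n,$$

where the second equality uses that $Y_{n+1}$ is independent of $Y_1,\ldots,Y_n$. □
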
The martingale in Theorem 1.1 is denoted the Wald martingale. The main
application is optional stopping, i.e. exploiting the identity

$$\mathbb{E}e^{\alpha(Y_1+\cdots+Y_\tau) - \tau\kappa(\alpha)} = 1 \tag{1.2}$$

where $\tau < \infty$ is a finite stopping time. A sufficient condition for (1.2) is that

$$\mathbb{E}\sup_{n\le\tau} e^{\alpha(Y_1+\cdots+Y_n) - n\kappa(\alpha)} < \infty.$$

For a necessary and sufficient condition, see III.1.4.
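
The martingale property implies in particular that the expression (1.1) has mean 1 for every fixed $n$. As a quick check, the following Python sketch computes this mean exactly for a Bernoulli random walk with $\mathbb{P}(Y_i = 1) = \theta$ by summing over the binomial distribution of the number of up-steps. The function names are hypothetical.

```python
import math


def bernoulli_cgf(alpha, theta):
    """kappa(alpha) = log(theta*e^alpha + (1 - theta)*e^(-alpha)) for steps in {-1, +1}."""
    return math.log(theta * math.exp(alpha) + (1.0 - theta) * math.exp(-alpha))


def wald_martingale_mean(n, theta, alpha):
    """E exp(alpha*(Y_1 + ... + Y_n) - n*kappa(alpha)), computed by exact summation."""
    kappa = bernoulli_cgf(alpha, theta)
    total = 0.0
    for k in range(n + 1):                      # k = number of +1 steps among n
        prob = math.comb(n, k) * theta ** k * (1.0 - theta) ** (n - k)
        total += prob * math.exp(alpha * (2 * k - n) - n * kappa)
    return total


if __name__ == "__main__":
    for n in (1, 5, 20):
        print(n, wald_martingale_mean(n, theta=0.6, alpha=0.3))
```

The printed values agree with 1 up to rounding error for any admissible choice of $\theta$ and $\alpha$.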
The Wald martingale generalizes to a Lévy process $\{X_t\}$, defined as a
continuous time process with stationary and independent increments. The traditional
formal definition is that $\{X_t\}$ is $\mathbb{R}$-valued with the increments

$$X_{t_1} - X_{t_0},\ X_{t_2} - X_{t_1},\ \ldots,\ X_{t_n} - X_{t_{n-1}}$$

being independent whenever $t_0 < t_1 < \ldots < t_n$ and with $X_{t_i} - X_{t_{i-1}}$ having
distribution depending only on $t_i - t_{i-1}$. An equivalent characterization is $\{X_t\}$
being Markov with state space $\mathbb{R}$ and

$$\mathbb{E}_x[f(X_{t+s} - X_t) \mid \mathcal{F}_t] = \mathbb{E}_0 f(X_s), \tag{1.3}$$

where $\mathbb{E}_x$ refers to the case $X_0 = x$. Note that the structure of such a process
admits a complete description: $\{X_t\}$ can be written as the independent sum of
a pure drift $\{\mu t\}$, a Brownian component $\{B_t\}$ (scaled by a variance constant
$\sigma^2$) and a pure jump process $\{M_t\}$,

$$X_t = X_0 + \mu t + \sigma B_t + M_t. \tag{1.4}$$


More precisely, the pure jump process is given by its Lévy measure $\nu(dx)$, a
positive measure on $\mathbb{R}$ with the properties

$$\int_{-\varepsilon}^{\varepsilon} x^2\,\nu(dx) < \infty, \qquad \int_{\{x:\,|x|>\varepsilon\}} \nu(dx) < \infty \tag{1.5}$$

for some (and then all) $\varepsilon > 0$. Roughly, the interpretation is that the rate of
a jump of size $x$ is $\nu(dx)$ (if $\int_{-\varepsilon}^{\varepsilon} |x|\,\nu(dx) = \infty$, this description needs some
amendments). The simplest case is $\beta = \|\nu\| < \infty$, which corresponds to the
compound Poisson case: here jumps of $\{M_t\}$ occur at rate $\beta$ and have
distribution $B = \nu/\beta$ (in particular, the claim surplus process for the compound
Poisson risk model, with premium rate $p$, corresponds to a Lévy process with
$\mu = -p$, $\sigma^2 = 0$ and $\nu = \beta B$). A general jump process can be thought of as
limit of compound Poisson processes with drift by considering a sequence $\nu^{(n)}$
of bounded measures with $\nu^{(n)} \uparrow \nu$. These issues are discussed in more detail
in XI.1. For the moment, it suffices to have Brownian motion (possibly with a
non-zero drift $\mu$ and a general variance constant $\sigma^2$) in mind as a second main
example besides the compound Poisson model. We have:


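Theorem 1.2 Let $\{X_t\}$ be a Lévy process and $\alpha \in \mathbb{R}$. Then $\mathbb{E}e^{\alpha X_t}$ is either
finite for all $t > 0$ or infinite for all $t > 0$. In the first case, $\mathbb{E}e^{\alpha X_t} = e^{t\kappa(\alpha)}$ for
some $\kappa(\alpha) \in \mathbb{R}$, and the process

$$e^{\alpha X_t - t\kappa(\alpha)} \tag{1.6}$$

is a martingale.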


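Proof. The first part is easily seen to hold with $\kappa(\alpha) = \log \mathbb{E}e^{\alpha(X_1 - X_0)}$. For the
second, denote by $M_t$ the expression (1.6) and let $\{\mathcal{F}_t\}$ be the natural filtration
of $\{X_t\}$. Then

$$\mathbb{E}[M_{t+s} \mid \mathcal{F}_t] = M_t\,\mathbb{E}\big[e^{\alpha(X_{t+s}-X_t) - s\kappa(\alpha)} \mid \mathcal{F}_t\big] = M_t\,\mathbb{E}e^{\alpha(X_{t+s}-X_t) - s\kappa(\alpha)} = M_t.$$ □


A sufficient condition for optional stopping, i.e. $\mathbb{E}M_\tau = 1$, is $\mathbb{E}\sup_{t\le\tau} M_t < \infty$.
For a necessary and sufficient condition, see again III.1.4.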

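For Brownian motion with drift $\mu$ and variance constant $\sigma^2$, $X_1$ is $N(\mu, \sigma^2)$.
Therefore $\mathbb{E}e^{\alpha X_1} = e^{\alpha\mu + \alpha^2\sigma^2/2}$ so that

$$\kappa(\alpha) = \alpha\mu + \alpha^2\sigma^2/2. \tag{1.7}$$

For the Cramér-Lundberg process $R_t$ with premium rate $p$, Poisson parameter
$\beta$ and claim size distribution $B$, it is shown in IV.1 that

$$\kappa(\alpha) = \beta(\hat{B}[-\alpha] - 1) + \alpha p. \tag{1.8}$$

For the claim surplus process $S_t = u - R_t$,

$$\kappa(\alpha) = \beta(\hat{B}[\alpha] - 1) - \alpha p. \tag{1.9}$$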
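
For orientation, the three cumulant functions can be written down directly in code. The sketch below specializes the compound Poisson case to exponential claims with rate $\delta$, so that $\hat{B}[s] = \delta/(\delta-s)$ for $s < \delta$; this choice of $B$ and the function names are assumptions of the sketch. It also checks that $\kappa$ vanishes at the non-zero root that later plays the role of the Lundberg coefficient.

```python
def kappa_brownian(alpha, mu, sigma2):
    """Cumulant function of Brownian motion with drift mu and variance constant sigma2."""
    return alpha * mu + 0.5 * alpha * alpha * sigma2


def kappa_reserve_exponential(alpha, beta, delta, p=1.0):
    """Cumulant function of the reserve process R_t with exponential(delta) claims.

    Uses B[-alpha] = delta / (delta + alpha), valid for alpha > -delta.
    """
    return beta * (delta / (delta + alpha) - 1.0) + alpha * p


def kappa_surplus_exponential(alpha, beta, delta, p=1.0):
    """Cumulant function of the claim surplus S_t = u - R_t; valid for alpha < delta."""
    return beta * (delta / (delta - alpha) - 1.0) - alpha * p


if __name__ == "__main__":
    beta, delta = 0.5, 1.0
    gamma = delta - beta
    print(kappa_reserve_exponential(-gamma, beta, delta))   # root of kappa(-gamma) = 0
    print(kappa_surplus_exponential(gamma, beta, delta))    # same root for the surplus
    mu, sigma2 = 0.3, 2.0
    print(kappa_brownian(-2.0 * mu / sigma2, mu, sigma2))   # root -2*mu/sigma^2
```

The reserve and surplus versions are mirror images, $\kappa_S(\alpha) = \kappa_R(-\alpha)$, which is why the sign convention for the Lundberg coefficient needs to be stated whenever one works with the reserve rather than the claim surplus.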


2 Gambler’s ruin. Two-sided ruin. Brownian 
motion 


The first solution of a ruin problem appears to be that of de Moivre (1711) 
for the gambler’s ruin problem, which is a two-sided ruin problem. That is, 
starting from $u \in [0,a]$ we define the two-barrier ruin probability $\psi_a(u)$ as the
probability of being ruined before the reserve reaches level $a$. I.e.

$$\psi_a(u) = \mathbb{P}(\tau(u,a) = \tau(u)) = 1 - \mathbb{P}(\tau(u,a) = \tau_+(a)),$$

where¹

$$\tau(u) = \inf\{t \ge 0 : R_t \le 0\}, \qquad \tau_+(a) = \inf\{t \ge 0 : R_t \ge a\},$$
$$\tau(u,a) = \tau(u) \wedge \tau_+(a) = \inf\{t \ge 0 : R_t \le 0 \text{ or } R_t \ge a\}.$$

Note that $\mathbb{P}(\tau(u,a) < \infty) = 1$, because the interval $[0,a]$ is finite. Besides
its intrinsic interest, $\psi_a(u)$ can also be a useful vehicle for computing $\psi(u)$ by
letting $a \to \infty$.

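De Moivre considered a Bernoulli random walk, defined as $R_0 = u$ with $u \in
\{0,1,\ldots,a\}$, $R_n = u + Y_1 + \cdots + Y_n$ where $Y_1, Y_2, \ldots$ are i.i.d. and $\{-1,1\}$-valued
with $\mathbb{P}(Y_k = 1) = \theta$. His result was:

Proposition 2.1 For a Bernoulli random walk with $\theta \ne 1/2$,

$$\psi_a(u) = \frac{\Big(\dfrac{1-\theta}{\theta}\Big)^a - \Big(\dfrac{1-\theta}{\theta}\Big)^u}{\Big(\dfrac{1-\theta}{\theta}\Big)^a - 1}, \qquad a = u, u+1, \ldots. \tag{2.1}$$

If $\theta = 1/2$, then $\psi_a(u) = \dfrac{a-u}{a}$.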

We give two proofs, one elementary but difficult to generalize to other models, 
and the other more advanced but applicable also in some other settings. 


Proof 1. Conditioning upon $Y_1$ yields immediately the recursion

$$\begin{aligned}
\psi_a(1) &= 1 - \theta + \theta\psi_a(2),\\
\psi_a(2) &= (1-\theta)\psi_a(1) + \theta\psi_a(3),\\
&\ \ \vdots\\
\psi_a(a-2) &= (1-\theta)\psi_a(a-3) + \theta\psi_a(a-1),\\
\psi_a(a-1) &= (1-\theta)\psi_a(a-2),
\end{aligned}$$

and insertion shows that (2.1) is the solution satisfying the obvious boundary
conditions $\psi_a(0) = 1$, $\psi_a(a) = 0$.² □


¹Note that the definition of $\tau(u)$ differs from the rest of the book where we use
$\tau(u) = \inf\{t \ge 0 : R_t < 0\}$ (sharp inequality); in most cases, either this makes no
difference ($\mathbb{P}(R_{\tau(u)} = 0) = 0$) or it is trivial to translate from one set-up to the other, as e.g. in the
Bernoulli random walk example below.


The second proof uses martingales. We remark here as a matter of
terminology that whereas our general definition of the Lundberg coefficient $\gamma$ in
Chapter I is as the non-zero solution of $\log \mathbb{E}e^{\gamma S_1} = 0$ where $S_t = R_0 - R_t$ is the
claim surplus, we work here directly with the reserve process $R_t$, so that for
our Bernoulli random walk the Lundberg coefficient is the non-zero solution of
$\log \mathbb{E}e^{-\gamma(R_1 - R_0)} = 0$, i.e. $\hat{F}[-\gamma] = 1$ where $\hat{F}[s] = \theta e^s + (1-\theta)e^{-s}$. In view of the
discrete nature of a Bernoulli random walk, we write $z = e^{-\gamma}$. The Lundberg
equation is then

$$1 = \hat{F}[-\gamma] = \theta z + (1-\theta)\frac{1}{z}$$

with solution $z = (1-\theta)/\theta$.


Similar remarks apply when computing the Lundberg coefficient for
Brownian motion below.


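Proof 2. Wald’s exponential martingale with $\alpha = -\gamma$ is just $\{e^{-\gamma R_n}\} = \{z^{R_n}\}$.
By optional stopping,

$$\begin{aligned}
z^u = \mathbb{E}z^{R_0} &= \mathbb{E}z^{R_{\tau(u,a)}}\\
&= z^0\,\mathbb{P}(R_{\tau(u,a)} = 0) + z^a\,\mathbb{P}(R_{\tau(u,a)} = a)\\
&= z^0\,\psi_a(u) + z^a(1 - \psi_a(u)),
\end{aligned} \tag{2.2}$$

and solving for $\psi_a(u)$ yields $\psi_a(u) = (z^a - z^u)/(z^a - 1)$.


If $\theta = 1/2$, (2.2) is trivial ($z = 1$). However, $\{R_n\}$ is then itself a martingale
and we get in a similar manner

$$u = \mathbb{E}R_0 = \mathbb{E}R_{\tau(u,a)} = 0\cdot\psi_a(u) + a(1 - \psi_a(u)), \qquad \psi_a(u) = \frac{a-u}{a}.$$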
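
Both proofs can be checked numerically. The following Python sketch evaluates (2.1) and measures how well it satisfies the one-step recursion of Proof 1 together with the boundary conditions $\psi_a(0) = 1$, $\psi_a(a) = 0$. The function names are illustrative only.

```python
def two_barrier_ruin(u, a, theta):
    """psi_a(u) for a Bernoulli random walk with P(Y = 1) = theta, as in (2.1)."""
    if theta == 0.5:
        return (a - u) / a
    z = (1.0 - theta) / theta
    return (z ** a - z ** u) / (z ** a - 1.0)


def recursion_residual(a, theta):
    """Largest violation of psi_a(u) = (1-theta)*psi_a(u-1) + theta*psi_a(u+1), 0 < u < a."""
    worst = 0.0
    for u in range(1, a):
        lhs = two_barrier_ruin(u, a, theta)
        rhs = ((1.0 - theta) * two_barrier_ruin(u - 1, a, theta)
               + theta * two_barrier_ruin(u + 1, a, theta))
        worst = max(worst, abs(lhs - rhs))
    return worst


if __name__ == "__main__":
    a = 10
    for theta in (0.4, 0.5, 0.6):
        values = [round(two_barrier_ruin(u, a, theta), 4) for u in range(a + 1)]
        print(theta, values, recursion_residual(a, theta))
```

Since the recursion with the two boundary values has a unique solution, a residual at rounding level confirms that (2.1) is that solution.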


We note that if $\theta \le 1/2$, then a Bernoulli random walk hits 0 w.p. 1 so
$\psi(u) = 1$ for $u \ge 0$. In contrast:


Corollary 2.2 For a Bernoulli random walk with $\theta > 1/2$,

$$\psi(u) = \Big(\frac{1-\theta}{\theta}\Big)^u. \tag{2.3}$$

If $\theta \le 1/2$, then $\psi(u) = 1$.


Proof. Let $a \to \infty$ in (2.1).


²Alternatively, a constructive solution of the difference equation $\psi_a(x) = (1-\theta)\psi_a(x-1)
+ \theta\psi_a(x+1)$ is to substitute $r^x$, leading to the two choices $r = 1$ and $r = (1-\theta)/\theta$ and
the result then follows from their linear combination determined by the boundary conditions.


Now let us turn to Brownian motion. 


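Proposition 2.3 Let $\{R_t\}$ be Brownian motion starting from $u > 0$ and with
drift $\mu$ and variance constant $\sigma^2$. Then for $\mu \ne 0$,

$$\psi_a(u) = \frac{e^{-2\mu a/\sigma^2} - e^{-2\mu u/\sigma^2}}{e^{-2\mu a/\sigma^2} - 1}. \tag{2.4}$$

If $\mu = 0$, then $\psi_a(u) = (a-u)/a$.
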
Proof. By (1.7), the Lundberg equation $\kappa(-\gamma) = 0$ is
$\gamma^2\sigma^2/2 - \gamma\mu = 0$ with solution $\gamma = 2\mu/\sigma^2$.
Applying optional stopping to the exponential martingale $\{e^{-\gamma R_t}\}$
yields

$$e^{-\gamma u} = E e^{-\gamma R_0} = E e^{-\gamma R_{\tau(u,a)}} = e^{0}\,\psi_a(u) + e^{-\gamma a}(1 - \psi_a(u)),$$

and solving for $\psi_a(u)$ yields
$\psi_a(u) = (e^{-\gamma a} - e^{-\gamma u})/(e^{-\gamma a} - 1)$ for $\mu \ne 0$.
If $\mu = 0$, $\{R_t\}$ is itself a martingale and just the same calculation as in
the proof of Proposition 2.1 yields $\psi_a(u) = (a - u)/a$.


Corollary 2.4 For Brownian motion starting in $u > 0$ with drift $\mu > 0$ and
variance constant $\sigma^2$,

$$\psi(u) = e^{-2\mu u/\sigma^2}. \tag{2.5}$$

If $\mu \le 0$, then $\psi(u) = 1$.


Proof. Let $a \to \infty$ in (2.4).


The reason that the calculations work out so smoothly for Bernoulli random
walks and Brownian motion is the skip-free nature of the paths, implying
$R_{\tau(u,a)} = a$ on $\{\tau(u,a) = \tau_+(a)\}$ and similarly for the boundary 0.
For most standard risk processes, the paths are upwards skip-free but not
downwards, and thus one encounters the problem of controlling the undershoot
under level 0. Here is one more case where this is feasible:


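Example 2.5 Consider a compound Poisson model with exponential claims
(with rate, say, $\delta$). That is,

$$R_t = u + t - \sum_{i=1}^{N_t} U_i,$$

where $\{N_t\}$ is a Poisson($\beta$) process and $P(U_i > x) = e^{-\delta x}$. Now
consider $R_{\tau(u,a)}$, assume $R_{\tau(u,a)-} = x > 0$, and let
$Z = -R_{\tau(u,a)} + x$ be the size of the claim leading to ruin. The available
information on $Z$ is that its distribution is that of a claim size $U$ given
$U > x$, and thus by the memoryless property of the exponential distribution,
the conditional distribution of the deficit $Z - x = -R_{\tau(u,a)}$ is again just
exponential with rate $\delta$. Hence

\begin{align*}
e^{-\gamma u} &= E e^{-\gamma R_0} = E e^{-\gamma R_{\tau(u,a)}} \\
&= E\big[e^{-\gamma R_{\tau(u,a)}} \,\big|\, R_{\tau(u,a)} < 0\big]\, P(R_{\tau(u,a)} < 0) + e^{-\gamma a}\, P(R_{\tau(u,a)} = a) \\
&= \frac{\delta}{\delta - \gamma}\, P(R_{\tau(u,a)} < 0) + e^{-\gamma a}\, P(R_{\tau(u,a)} = a) \\
&= \frac{\delta}{\delta - \gamma}\, \psi_a(u) + e^{-\gamma a}(1 - \psi_a(u)).
\end{align*}

It follows from (1.8) that $\gamma = \delta - \beta$, from which we obtain

$$\psi_a(u) = \frac{e^{-\gamma u} - e^{-\gamma a}}{\delta/\beta - e^{-\gamma a}}. \tag{2.6}$$

Again, letting $a \to \infty$ yields the classical expression $\rho e^{-\gamma u}$ for
$\psi(u)$ where $\rho = \beta/\delta$ (cf. I.4b), valid if $\rho < 1$ (otherwise,
$\psi(u) = 1$).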


However, passing to even more general cases the method quickly becomes
infeasible (see, however, IX.5a). It may then be easier to first compute the
one-barrier ruin probability $\psi(u)$:


Proposition 2.6 If the paths of $\{R_t\}$ are upward skip-free and $\psi(a) < 1$,
then

$$\psi_a(u) = \frac{\psi(u) - \psi(a)}{1 - \psi(a)}, \qquad 0 \le u \le a. \tag{2.7}$$

Proof. By the upward skip-free property,

$$\psi(u) = \psi_a(u) + (1 - \psi_a(u))\,\psi(a).$$

If $\psi(a) < 1$, this immediately yields (2.7).


We will meet this argument again in VIII.1a for computing ruin probabilities
for a two-step premium function.
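As a consistency check, the two-barrier formula (2.6) for exponential claims should coincide with what (2.7) gives when the one-barrier probability $\psi(u) = \rho e^{-\gamma u}$ is inserted. The Python sketch below (function names are ours; premium rate 1 and $\rho = \beta/\delta < 1$ are assumed, as in Example 2.5) evaluates both expressions.

```python
import math


def psi_infinite(u, beta, delta):
    """One-barrier ruin probability rho * exp(-gamma * u), assuming rho = beta / delta < 1."""
    rho, gamma = beta / delta, delta - beta
    return rho * math.exp(-gamma * u)


def psi_two_barrier_direct(u, a, beta, delta):
    """Two-barrier ruin probability from formula (2.6)."""
    gamma = delta - beta
    num = math.exp(-gamma * u) - math.exp(-gamma * a)
    return num / (delta / beta - math.exp(-gamma * a))


def psi_two_barrier_via_one_barrier(u, a, beta, delta):
    """Two-barrier ruin probability from formula (2.7), using psi_infinite."""
    pa = psi_infinite(a, beta, delta)
    return (psi_infinite(u, beta, delta) - pa) / (1.0 - pa)
```

The agreement is algebraic: multiplying numerator and denominator of (2.7) by $1/\rho = \delta/\beta$ gives (2.6).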


Let us now return to Bernoulli random walk and Brownian motion and 
consider finite horizon ruin probabilities. For the symmetric (drift 0) case these 
are easily computable by means of the reflection principle:

Proposition 2.7 For Brownian motion starting in $u > 0$ with drift 0 and
variance constant $\sigma^2$,

$$\psi(u,T) = P(\tau(u) \le T) = 2\Phi\!\left(\frac{-u}{\sigma\sqrt{T}}\right). \tag{2.8}$$

Proof. In terms of the claim surplus process $\{S_t\} = \{u - R_t\}$, we have
$\psi(u,T) = P(M_T \ge u)$ where $M_T = \max_{0 \le t \le T} S_t$. Here $\{S_t\}$ is
Brownian motion with drift 0 (starting from 0), in particular symmetric, so that
from time $\tau(u)$ (when the level is $u$) it is equally likely to go to levels
$< u$ and levels $> u$ in time $T - \tau(u)$. Hence
$P(M_T \ge u, S_T < u) = P(M_T \ge u, S_T > u)$, and one gets

\begin{align*}
P(M_T \ge u) &= P(S_T \ge u) + P(S_T < u,\, M_T \ge u) \\
&= P(S_T \ge u) + P(S_T > u,\, M_T \ge u) \\
&= P(S_T \ge u) + P(S_T > u) \tag{2.9} \\
&= 2P(S_T > u) = 2\Phi\!\left(\frac{-u}{\sigma\sqrt{T}}\right).
\end{align*}


Small modifications also apply to Bernoulli random walks: 


Proposition 2.8 For the Bernoulli random walk with $\theta = 1/2$,

$$\psi(u,T) = P(S_T = u) + 2P(S_T > u), \tag{2.10}$$

whenever $u$, $T$ are integer-valued and non-negative. Here

$$P(S_T = v) = \begin{cases} 2^{-T}\dbinom{T}{(v+T)/2}, & v = -T, -T+2, \ldots, T-2, T, \\ 0, & \text{otherwise.} \end{cases}$$

Proof. The argument leading to (2.9) goes through unchanged, and (2.10) is the
same as (2.9). The expression for $P(S_T = v)$ is just a standard formula for the
binomial distribution.
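The reflection formula (2.10) can be confirmed exactly for small horizons. The Python sketch below (function names are ours, not the book's) computes the right-hand side of (2.10) from the binomial formula and compares it with $P(\max_{n \le T} S_n \ge u)$ obtained by dynamic programming, where paths are absorbed once they reach level $u$.

```python
from fractions import Fraction
from math import comb


def prob_S_T(v, T):
    """P(S_T = v) for the symmetric Bernoulli random walk started at 0."""
    if abs(v) > T or (v + T) % 2 != 0:
        return Fraction(0)
    return Fraction(comb(T, (v + T) // 2), 2**T)


def ruin_prob_formula(u, T):
    """Right-hand side of (2.10)."""
    return prob_S_T(u, T) + 2 * sum(prob_S_T(v, T) for v in range(u + 1, T + 1))


def ruin_prob_dp(u, T):
    """P(max_{n <= T} S_n >= u), computed by absorbing paths at level u."""
    if u == 0:
        return Fraction(1)
    alive = {0: Fraction(1)}  # mass of paths that stayed strictly below u
    absorbed = Fraction(0)
    for _ in range(T):
        nxt = {}
        for pos, mass in alive.items():
            for step in (-1, 1):
                q = pos + step
                if q >= u:
                    absorbed += mass / 2
                else:
                    nxt[q] = nxt.get(q, Fraction(0)) + mass / 2
        alive = nxt
    return absorbed
```

For example, with $u = 1$ and $T = 2$ both functions return $1/2$: the only ruinous paths are those starting with an upward step.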


Notes and references All material of the present section is standard. A classical 
reference for further aspects of Bernoulli random walks is Feller [361]. For gener- 
alizations of Proposition 2.6 to Markov-modulated models, see Asmussen & Perry 
[95]. Further early references on two-barrier ruin problems include Dickson & Gray 
[312, 313].

3 Further simple martingale calculations


We consider the claim surplus process $\{S_t\}$ of a general risk process. As
usual, the time to ruin $\tau(u)$ is $\inf\{t \ge 0 : S_t > u\}$, and the ruin
probabilities are

$$\psi(u) = P(\tau(u) < \infty), \qquad \psi(u,T) = P(\tau(u) \le T).$$

Our first result is a representation formula for $\psi(u)$ obtained by using the
martingale optional stopping theorem. Let $\xi(u) = S_{\tau(u)} - u$ denote the
overshoot.


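Proposition 3.1 Assume that (a) for some $\gamma > 0$, $\{e^{\gamma S_t}\}_{t \ge 0}$
is a martingale, (b) $S_t \to -\infty$ a.s. on $\{\tau(u) = \infty\}$. Then

$$\psi(u) = \frac{e^{-\gamma u}}{E\big[e^{\gamma\xi(u)} \,\big|\, \tau(u) < \infty\big]}, \qquad u \ge 0. \tag{3.1}$$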


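Proof. We shall use optional stopping at time $\tau(u) \wedge T$.³ We get

\begin{align*}
1 = E e^{\gamma S_0} &= E e^{\gamma S_{\tau(u) \wedge T}} \\
&= E\big[e^{\gamma S_{\tau(u)}};\ \tau(u) \le T\big] + E\big[e^{\gamma S_T};\ \tau(u) > T\big]. \tag{3.2}
\end{align*}

As $T \to \infty$, the second term converges to 0 by (b) and dominated convergence
($e^{\gamma S_T} \le e^{\gamma u}$ on $\{\tau(u) > T\}$), and in the limit (3.2) takes
the form

\begin{align*}
1 &= E\big[e^{\gamma S_{\tau(u)}};\ \tau(u) < \infty\big] + 0 \\
&= e^{\gamma u}\, E\big[e^{\gamma\xi(u)};\ \tau(u) < \infty\big] = e^{\gamma u}\, E\big[e^{\gamma\xi(u)} \,\big|\, \tau(u) < \infty\big]\, \psi(u).
\end{align*}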


Example 3.2 Consider the compound Poisson model with Poisson arrival rate
$\beta$, claim size distribution $B$ and $\rho = \beta\mu_B < 1$. Thus

$$S_t = \sum_{i=1}^{N_t} U_i - t,$$

where $\{N_t\}$ is a Poisson process with rate $\beta$ and the $U_i$ are i.i.d. with
common distribution $B$ (and independent of $\{N_t\}$).

Condition (a) of Proposition 3.1 is satisfied if a positive solution $\gamma > 0$
to the Lundberg equation (1.9) (i.e. an adjustment coefficient) exists. Property
(b) follows from $\rho < 1$ and the law of large numbers (see Proposition
IV.1.2(c)).


³ We cannot use the stopping time $\tau(u)$ directly because
$P(\tau(u) = \infty) > 0$ and also because the conditions of the optional stopping
theorem present a problem; however, using $\tau(u) \wedge T$ poses no problems
because $\tau(u) \wedge T$ is bounded by $T$.

Example 3.3 Assume that $\{R_t\}$ is Brownian motion with variance constant
$\sigma^2$ and drift $\mu > 0$. Then $\{S_t\}$ is Brownian motion with variance
constant $\sigma^2$ and drift $-\mu < 0$. Since the positive solution to the
Lundberg equation (1.7) is $\gamma = 2\mu/\sigma^2$, the conditions of Proposition
3.1 are satisfied.


Corollary 3.4 (LUNDBERG'S INEQUALITY) Under the conditions of Proposition 3.1,
$\psi(u) \le e^{-\gamma u}$ for all $u \ge 0$.


Proof. Just note that $\xi(u) \ge 0$ a.s.


We also retrieve again the exact expression of Section I.4b for exponential
claims:


Corollary 3.5 For the compound Poisson model with $B$ exponential,
$\bar B(x) = e^{-\delta x}$, and $\rho = \beta/\delta < 1$, the ruin probability is
$\psi(u) = \rho e^{-\gamma u}$ where $\gamma = \delta - \beta$.


Proof. As before, from (1.9) it is immediately seen that $\gamma = \delta - \beta$.
Now at the time $\tau(u)$ of ruin, $\{S_t\}$ upcrosses level $u$ by making a jump.
By the memoryless property of the exponential distribution, the conditional
distribution of the overshoot $\xi(u)$ is again just exponential with rate
$\delta$. Thus

$$E\big[e^{\gamma\xi(u)} \,\big|\, \tau(u) < \infty\big] = \int_0^\infty e^{\gamma x}\,\delta e^{-\delta x}\,dx = \int_0^\infty \delta e^{-\beta x}\,dx = \frac{\delta}{\beta} = \frac{1}{\rho}.$$

If $\{R_t\}$ is Brownian motion with variance constant $\sigma^2$ and drift
$\mu > 0$, then $\xi(u) = 0$ by continuity of Brownian motion and
$\psi(u) = e^{-2\mu u/\sigma^2}$, which
reconfirms Corollary 2.4.
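To make the ingredients of Corollary 3.5 concrete, the following Python sketch solves the Lundberg equation numerically and plugs the root into the representation (3.1). It assumes the form $\kappa(\alpha) = \beta(\hat B[\alpha] - 1) - \alpha$ with $\hat B[\alpha] = \delta/(\delta - \alpha)$ for exponential($\delta$) claims and premium rate 1; the function names and the bisection bracket are our own choices.

```python
import math


def kappa(alpha, beta, delta):
    """kappa(alpha) = beta * (B_hat[alpha] - 1) - alpha for exponential(delta) claims."""
    return beta * (delta / (delta - alpha) - 1.0) - alpha


def adjustment_coefficient(beta, delta, tol=1e-12):
    """Positive root of kappa on (0, delta) by bisection; assumes beta / delta < 1."""
    lo, hi = 1e-6 * delta, delta * (1.0 - 1e-12)  # kappa(lo) < 0 < kappa(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kappa(mid, beta, delta) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def psi_from_representation(u, beta, delta):
    """Formula (3.1) with E[exp(gamma * xi(u)) | tau(u) < inf] = delta / (delta - gamma)."""
    gamma = adjustment_coefficient(beta, delta)
    return math.exp(-gamma * u) / (delta / (delta - gamma))
```

With $\beta = 1$, $\delta = 3$ the root is $\gamma = 2$ and the function reproduces $\psi(u) = \tfrac13 e^{-2u}$, which also lies below the Lundberg bound $e^{-\gamma u}$.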

Notes and references The first use of martingales in risk theory is due to Gerber 
[397], and is further exploited in his book [398]. More recent references are Dassios & 


Embrechts [273], Grandell [429, 430], Embrechts, Grandell & Schmidli [345], Delbaen 
& Haezendonck [287] and Schmidli [770, 780]. 


4 More advanced martingales 


4a Generators. The Dynkin martingale 


Assume that a stochastic process $\{R_t\}$ is a Markov process and write $P_u$
and $E_u$ for the case $R_0 = u$. In loose terms, the generator $\mathcal{A}$ is
then an operator in an appropriate function space such that

$$\frac{d}{dt}\, E_u f(R_t)\Big|_{t=0} = \mathcal{A}f(u)$$

or equivalently

$$\lim_{h \downarrow 0} \frac{E_u f(R_h) - f(u)}{h} = \mathcal{A}f(u) \tag{4.1}$$

for a sufficiently rich class of functions $f$.

Historically, there have, however, been several different ways to make this
loose definition precise, and in particular, one will find many definitions of
the domain $\mathcal{D}(\mathcal{A})$ on which $\mathcal{A}$ is defined. For
example, some older definitions require (4.1) to hold uniformly or locally
uniformly in $u$. The most standard current definition is in terms of
martingales: $f \in \mathcal{D}(\mathcal{A})$ with $\mathcal{A}f = g$ ($g$ can be
shown to be unique up to some null set complications, cf. Davis [278, p.32]) if
and only if

$$f(R_t) - \int_0^t g(R_s)\,ds \tag{4.2}$$

is a local martingale⁴, commonly referred to as the Dynkin martingale. The
motivation relating to (4.1) is loosely the following. Denote by $M_t$ the
expression (4.2), assume it is a martingale (and not just a local martingale)
and that the function $s \mapsto E_u\mathcal{A}f(R_s)$ is sufficiently
well-behaved at $s = 0$, say continuous and bounded. Then

\begin{align*}
E_u f(R_h) &= f(u) + E_u[M_h - M_0] + E_u\int_0^h \mathcal{A}f(R_s)\,ds \\
&= f(u) + 0 + h\,\mathcal{A}f(u) + o(h),
\end{align*}

so that (4.1) holds.


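Example 4.1 Assume that $\{R_t\}$ is Brownian motion with drift $\mu$ and
variance constant $\sigma^2$. Let further $f$ be sufficiently smooth with compact
support. Then if $V$ is a standard normal r.v., we have⁵

\begin{align*}
E_u f(R_h) &= E f(u + \mu h + \sqrt{h}\,\sigma V) \\
&= E\big[f(u) + f'(u)[\mu h + \sqrt{h}\,\sigma V] + f''(u)[\mu h + \sqrt{h}\,\sigma V]^2/2 + O(h^2)\big] \\
&= f(u) + f'(u)\mu h + f''(u)\, h\sigma^2\, E V^2/2 + O(h^2).
\end{align*}

Thus (4.1) holds with
$\mathcal{A}f = \mu f' + (\sigma^2/2) f''$.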


⁴ Strictly speaking, a local martingale w.r.t. $P_x$ for all $x$. For ease of
exposition, we omit such specification here and in the following.

⁵ This calculation is of course a heuristic derivation of Itô's formula. In its
full generality Itô's formula permits to weaken the assumption on $f$ to
$f \in C^2$.

It is less clear how much one can relax the assumptions on $f$ to still get a
local martingale and we will not go into this issue here. Nevertheless, it is
clear that for a suitable class of twice differentiable functions $f$, one should
have $f \in \mathcal{D}(\mathcal{A})$ and that
$\mathcal{A}f = \mu f' + (\sigma^2/2) f''$.

The operator sending a twice differentiable function $f$ to
$\mu f' + (\sigma^2/2) f''$ is often called the differential operator of the
Brownian motion. Similarly, a diffusion process with local drift $\mu(u)$ and
local variance $\sigma^2(u)$ has differential operator
$\mu(u) f'(u) + (\sigma^2(u)/2) f''(u)$.


Example 4.2 Consider the Cramér-Lundberg model with parameters $\beta$, $B$ and
let $U$ be a generic claim. Then, conditioning on whether or not a claim occurs
in $(0,h)$, we have under suitable conditions on $f$

\begin{align*}
E_u f(R_h) &= (1 - \beta h) f(u + h) + \beta h\, E f(u - U + O(h)) + o(h) \\
&= f(u) - \beta f(u) h + h f'(u) + \beta h \int_0^\infty f(u - x)\, B(dx) + o(h).
\end{align*}

Thus, it is clear that for a suitable class of differentiable functions $f$, one
should have $f \in \mathcal{D}(\mathcal{A})$ and that

$$\mathcal{A}f(u) = -\beta f(u) + f'(u) + \beta \int_0^\infty f(u - x)\, B(dx).$$

A function $f$ such that $f(R_t)$ is a martingale is called harmonic. Obviously,
in view of (4.1) a harmonic function will have the property
$f \in \mathcal{D}(\mathcal{A})$ and $\mathcal{A}f = 0$.


Example 4.3 As a simple example of the relevance of harmonic functions for
ruin theory, consider the Brownian setting of Example 4.1 and take
$f(u) = \psi(u)$ (the ruin probability), with the convention $\psi(u) = 1$ for
$u < 0$. For $u > 0$, $P(\tau(u) \le h) = o(h)$⁶ and so by boundedness of
$\psi(u)$ and the Markov property of $\{R_t\}$,

$$\psi(u) = E_u[\psi(R_h);\ \tau(u) > h] + E_u[I(\tau(u) \le h)] = E_u \psi(R_h) + o(h).$$

The same holds for $u < 0$, and so (omitting the details for $u = 0$) we conclude
that $\psi(u)$ is harmonic. Combining this with the remarks of Example 4.1, we
conclude that

$$\mu\psi'(u) + \psi''(u)\sigma^2/2 = 0.$$


⁶ This for instance follows from III.(1.8).

The general solution of this differential equation is
$C_+ e^{\lambda_+ u} + C_- e^{\lambda_- u}$, where $\lambda_\pm$ are the solutions
of the quadratic equation $0 = \lambda\mu + \lambda^2\sigma^2/2$, i.e.
$\lambda_+ = 0$ and $\lambda_- = -2\mu/\sigma^2$. If $\mu > 0$, we have
$\psi(u) \to 0$ as $u \to \infty$, and so $C_+ = 0$. Together with the boundary
condition $\psi(0) = 1$ (due to the oscillation) we arrive at
$\psi(u) = e^{-2\mu u/\sigma^2}$ as found in Corollary 2.4 by different means.

The method employed may be seen as a continuous time analogue of Proof
1 of Proposition 2.1.
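The harmonicity of the Brownian ruin probability is easy to check numerically. The following Python sketch (our own helper, using central differences with an arbitrarily chosen step `h`) approximates $\mathcal{A}f(u) = \mu f'(u) + (\sigma^2/2) f''(u)$ and can be applied to $\psi(u) = e^{-2\mu u/\sigma^2}$ at any interior point $u > 0$.

```python
import math


def generator_bm(f, u, mu, sigma2, h=1e-3):
    """Central-difference approximation of (A f)(u) = mu * f'(u) + (sigma2 / 2) * f''(u)."""
    d1 = (f(u + h) - f(u - h)) / (2.0 * h)            # approximates f'(u)
    d2 = (f(u + h) - 2.0 * f(u) + f(u - h)) / (h * h)  # approximates f''(u)
    return mu * d1 + 0.5 * sigma2 * d2
```

Applied to $\psi$ the result vanishes up to discretization error, whereas for a non-harmonic function such as $f(u) = u^2$ one gets $2\mu u + \sigma^2$.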


Notes and references Ethier & Kurtz [359] is a good reference for the modern 
approach to generators. Further references are Davis [278] and Rolski et al. [746]. 


4b Diffusions and two-sided ruin 


One of the major areas where generators play a main role is diffusion processes.
Thus let $\{R_t\}$ be a diffusion on $[0, \infty)$ with drift $\mu(x)$ and variance
$\sigma^2(x)$ at $x$. We assume that $\mu(x)$ and $\sigma^2(x)$ are continuous with
$\sigma^2(x) > 0$ for $x > 0$. Thus, close to $x$, $\{R_t\}$ behaves as Brownian
motion with drift $\mu = \mu(x)$ and variance $\sigma^2 = \sigma^2(x)$. We can
define a 'local' adjustment coefficient $\gamma(x) = -2\mu(x)/\sigma^2(x)$ for the
locally approximating Brownian motion. Let

$$s(y) = e^{\int_0^y \gamma(x)\,dx}, \qquad S(x) = \int_0^x s(y)\,dy, \qquad S(\infty) = \int_0^\infty s(y)\,dy. \tag{4.3}$$

The following result gives a complete solution of the ruin problem for the
diffusion subject to the assumption that $S(x)$, as defined in (4.3) with 0 as
lower limit of integration, is finite for all $x > 0$. If this assumption fails,
the behavior at the boundary 0 is more complicated and it may happen, e.g., that
$\psi(u)$, as defined above as the probability of actually hitting 0, is zero for
all $u > 0$ but that nevertheless $R_t \to 0$ a.s. (the problem leads into the
complicated area of boundary classification of diffusions, see e.g. Breiman
[199] or Karlin & Taylor [522, p. 226]).


Theorem 4.4 Consider a diffusion process $\{R_t\}$ on $[0,\infty)$, such that the
drift $\mu(x)$ and the variance $\sigma^2(x)$ are continuous functions of $x$ and
that $\sigma^2(x) > 0$ for $x > 0$. Assume further that $S(x)$ as defined in (4.3)
is finite for all $x > 0$. If

$$S(\infty) < \infty, \tag{4.4}$$

then $0 < \psi(u) < 1$ for all $u > 0$ and

$$\psi(u) = 1 - \frac{S(u)}{S(\infty)}.$$

Conversely, if (4.4) fails, then $\psi(u) = 1$ for all $u > 0$.

Lemma 4.5 Let $0 < b < u < a$ and let $\psi_{a,b}(u)$ be the probability that
$\{R_t\}$ hits $b$ before $a$ starting from $u$. Then

$$\psi_{a,b}(u) = \frac{S(a) - S(u)}{S(a) - S(b)}. \tag{4.5}$$

Proof. Recall that under mild conditions on $q$,
$E_u q(R_h) = q(u) + \mathcal{A}q(u)h + o(h)$, where

$$\mathcal{A}q(u) = \frac{\sigma^2(u)}{2}\, q''(u) + \mu(u)\, q'(u).$$

If $b < u < a$, the probability of ruin or hitting the upper barrier $a$ before
$h$ is of order $o(h)$, so that

$$\psi_{a,b}(u) = E_u \psi_{a,b}(R_h) + o(h) = \psi_{a,b}(u) + \mathcal{A}\psi_{a,b}(u)h + o(h),$$

i.e. $\mathcal{A}\psi_{a,b}(u) = 0$. Using $s'(x)/s(x) = -2\mu(x)/\sigma^2(x)$,
elementary calculus shows that we can rewrite $\mathcal{A}$ as

$$\mathcal{A}q(u) = \frac{1}{2}\,\sigma^2(u)\, s(u)\, \frac{d}{du}\left[\frac{q'(u)}{s(u)}\right]. \tag{4.6}$$

Hence $\mathcal{A}\psi_{a,b}(u) = 0$ implies that $\psi_{a,b}'(u)/s(u)$ is
constant, i.e. $\psi_{a,b}(u) = \alpha + \beta S(u)$. The obvious boundary
conditions $\psi_{a,b}(b) = 1$, $\psi_{a,b}(a) = 0$ then yield the result.

Proof of Theorem 4.4. Letting $b \downarrow 0$ in (4.5) yields
$\psi_a(u) = 1 - S(u)/S(a)$. Letting $a \uparrow \infty$ and considering the cases
$S(\infty) = \infty$, $S(\infty) < \infty$ separately
completes the proof.
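For coefficients where $S$ is not available in closed form, the scale function can be evaluated by quadrature. The Python sketch below is one possible way to do this, not a prescribed method: it takes $s(y) = \exp\{-\int_0^y 2\mu(x)/\sigma^2(x)\,dx\}$, applies the composite trapezoidal rule to both integrals, and returns the two-barrier probability $\psi_a(u) = 1 - S(u)/S(a)$ from the proof. Function names and the grid size `n` are our assumptions.

```python
import math


def scale_function(mu, sigma2, x, n=2000):
    """S(x) = int_0^x s(y) dy with s(y) = exp(-int_0^y 2 mu(z) / sigma2(z) dz).

    mu and sigma2 are callables; both integrals use the trapezoidal rule
    on n sub-intervals of [0, x].
    """
    h = x / n
    rate = lambda z: -2.0 * mu(z) / sigma2(z)  # s'(z) / s(z)
    log_s = 0.0    # running value of log s(y)
    s_prev = 1.0   # s(0) = 1
    total = 0.0
    for i in range(1, n + 1):
        y0, y1 = (i - 1) * h, i * h
        log_s += 0.5 * h * (rate(y0) + rate(y1))
        s_cur = math.exp(log_s)
        total += 0.5 * h * (s_prev + s_cur)
        s_prev = s_cur
    return total


def psi_two_barrier_diffusion(u, a, mu, sigma2):
    """Probability of hitting 0 before a from u, i.e. 1 - S(u) / S(a)."""
    return 1.0 - scale_function(mu, sigma2, u) / scale_function(mu, sigma2, a)
```

With constant coefficients, say `mu = lambda x: 1.0` and `sigma2 = lambda x: 2.0`, the output agrees with the Brownian formula (2.4) up to quadrature error.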


Notes and references A good introduction to diffusions is in Karlin & Taylor
[522]; see in particular pp. 191-195 for material related to Theorem 4.4. In
view of (4.5), the function $S(x)$ is referred to as the natural scale in the
general theory of diffusions (in case of integrability problems at 0, one works
instead with a lower limit $\varepsilon > 0$ of integration in (4.3)). Another
basic quantity is the speed measure $M$, defined by the density
$1/(\sigma^2(u)s(u))$ showing up in (4.6). For related arguments concerning
the local adjustment coefficient, see VIII.3.


4c The Kella-Whitt martingale 


Example 4.6 As a motivating example, consider the claim surplus process
$\{S_t\}$ of a Cramér-Lundberg model with a claim size distribution $B$ that is a
mixture of two exponential distributions. To be definite, let the density be

$$b(x) = \frac{1}{2}\, 3e^{-3x} + \frac{1}{2}\, 7e^{-7x}$$

and let the Poisson parameter be $\beta = 3$. Then

$$\kappa(\alpha) = 3\left(\frac{1}{2}\,\frac{3}{3-\alpha} + \frac{1}{2}\,\frac{7}{7-\alpha} - 1\right) - \alpha. \tag{4.7}$$

Thus a solution of the Lundberg equation $\kappa(\alpha) = 0$ must satisfy

$$0 = \frac{9}{2}(7 - \alpha) + \frac{21}{2}(3 - \alpha) - 3(3 - \alpha)(7 - \alpha) - \alpha(3 - \alpha)(7 - \alpha).$$

This is a cubic equation, and the roots are easily seen to be 0, $\gamma = 1$,
$\gamma^* = 6$; that the Lundberg coefficient $\gamma$ must be 1 and not 6 follows
since $E e^{\alpha S_t}$ is only finite if $\alpha < 3$. Consider the problem of
two-sided exit of $\{S_t\}$ from $[-a, b]$ with $a, b > 0$. Let $p_0$ be the
probability of exit at the lower barrier and $p_3$, $p_7$ the probabilities of
exit at the upper barrier as result of a jump which is exponential with rate 3
and 7, respectively. For brevity, write $\tau = \tau(u,a)$. Now $S_\tau$ equals
$-a$ for lower exit and $b + V_3$ for upper exit as result of an exponential(3)
jump (where $V_3$ is again exponential(3) due to the lack-of-memory property of
the exponential distribution). Defining $V_7$ similarly, we get by optional
stopping of $\{e^{\gamma S_t}\}$

$$1 = e^{0} = E e^{\gamma S_\tau} = p_0 e^{-\gamma a} + p_3 e^{\gamma b}\,\frac{3}{3-\gamma} + p_7 e^{\gamma b}\,\frac{7}{7-\gamma},$$

$$6 = 6 p_0 e^{-a} + 9 p_3 e^{b} + 7 p_7 e^{b}. \tag{4.8}$$

Also obviously $1 = p_0 + p_3 + p_7$. To get the missing third equation, it is
tempting to formally proceed with $\gamma^*$ as with $\gamma$, which would give

$$1 = e^{0} = E e^{\gamma^* S_\tau} = p_0 e^{-\gamma^* a} + p_3 e^{\gamma^* b}\,\frac{3}{3-\gamma^*} + p_7 e^{\gamma^* b}\,\frac{7}{7-\gamma^*}, \tag{4.9}$$

$$1 = p_0 e^{-6a} + e^{6b}(-p_3 + 7p_7). \tag{4.10}$$

The problem is, however, that $E e^{\gamma^* S_t} = \infty$, so
$\{e^{\gamma^* S_t}\}$ is not a martingale (or for that matter a local
martingale), so we are missing a rigorous justification for (4.9).


We will see in Example 4.9 that the solution is nevertheless correct. For
this and other purposes, we will exploit a martingale introduced by Kella &
Whitt [527]. Let $\{R_t\}$ be a Lévy process with Lévy exponent $\kappa(\alpha)$.
The Wald martingale is then $M_t = e^{\alpha R_t - t\kappa(\alpha)}$. The
Kella-Whitt martingale is a stochastic integral w.r.t. $\{M_t\}$ and has a
somewhat different range of applications; in particular, it allows for a more
direct study of aspects of reflected Lévy processes.

Theorem 4.7 Let $\{R_t\}$ be a Lévy process with Lévy exponent $\kappa(\alpha)$,
let

$$Y_t = \int_0^t dY_s^c + \sum_{0 \le s \le t} \Delta Y_s$$

be an adapted process of locally bounded variation with continuous part
$\{Y_t^c\}$, D-paths and jumps $\Delta Y_s = Y_s - Y_{s-}$, and define
$Z_t = x + R_t + Y_t$. For each $t$, let $K_t$ be the r.v.

$$K_t = \kappa(\alpha)\int_0^t e^{\alpha Z_s}\,ds + e^{\alpha x} - e^{\alpha Z_t} + \alpha\int_0^t e^{\alpha Z_s}\,dY_s^c + \sum_{0 \le s \le t} e^{\alpha Z_s}\big(1 - e^{-\alpha\Delta Y_s}\big).$$

Then $\{K_t\}$ is a local martingale whenever $\kappa(\alpha)$ is well-defined.


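Proof. Let $B_t = e^{\alpha Y_t + t\kappa(\alpha)}$ and take the Wald martingale
started from $x$, $M_t = e^{\alpha(x + R_t) - t\kappa(\alpha)}$. Then, by the
general theory of stochastic integration, $K_t^* = \int_0^t B_{s-}\,dM_s$ is a
local martingale. Using the formula for integration by parts (see e.g. Protter
[718, p.60] for a version sufficiently general to deal with the present case)
yields

$$M_t B_t - M_0 B_0 = \int_0^t M_{s-}\,dB_s + K_t^* + \sum_{0 \le s \le t} \Delta M_s \Delta B_s.$$

Inserting

$$\sum_{0 \le s \le t} \Delta M_s \Delta B_s = \int_0^t \Delta M_s\,dB_s = \int_0^t (M_s - M_{s-})\,dB_s,$$

it follows that

$$-K_t^* = \int_0^t M_s\,dB_s + M_0 B_0 - M_t B_t. \tag{4.11}$$

Using $M_s B_s = e^{\alpha Z_s}$ and
$dB_s = B_s\big(\alpha\,dY_s^c + \kappa(\alpha)\,ds + 1 - e^{-\alpha\Delta Y_s}\big)$
shows that the r.h.s. of (4.11) reduces to $K_t$.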

For practical purposes, the main application is optional stopping which is 
often verified via the following standard result from martingale theory: 


Theorem 4.8 If for a given $t$ we have $E\sup_{s \le t}|K_s| < \infty$, then
$\{K_t\}$ is a proper martingale. Further, let $\tau$ be a stopping time such that
$E\sup_{s \le \tau}|K_s| < \infty$. Then $E K_\tau = E K_0 = 0$.

Example 4.9 Consider again the mixed exponential setting of Example 4.6.
Take $Y_t \equiv 0$. Thus we have simply

$$K_t = \kappa(\alpha)\int_0^t e^{\alpha S_s}\,ds + 1 - e^{\alpha S_t}$$

whenever $\alpha < 3$. Noting that $-a \le S_s \le b$ for $s < \tau$, we get for
$s \le \tau$

$$|K_s| \le |\kappa(\alpha)|\, s\, e^{|\alpha|\max(a,b)} + 1 + e^{|\alpha| a} + e^{|\alpha|(b + V)},$$

where $V$ stands for the overshoot at upper exit ($V_3$ or $V_7$). Since
$E\tau < \infty$, this shows the conditions of Theorem 4.8 for $|\alpha| < 3$ and
gives as in (4.9) that

$$0 = \kappa(\alpha)\varphi(\alpha) + 1 - p_0 e^{-\alpha a} - p_3 e^{\alpha b}\,\frac{3}{3-\alpha} - p_7 e^{\alpha b}\,\frac{7}{7-\alpha}, \tag{4.12}$$

where $\varphi(\alpha) = E\int_0^\tau e^{\alpha S_s}\,ds$. The same bound as used
above easily gives that $\varphi(\alpha)$ is defined for all
$\alpha \in \mathbb{C}$, not only for $|\alpha| < 3$, and is an analytic function
of $\alpha$. Since everything else in (4.12) is analytic in the region
$\Omega = \mathbb{C}\setminus\{3,7\}$ (think of $\kappa(\alpha)$ as the r.h.s. of
(4.7), i.e. the analytic continuation of $\log E e^{\alpha S_1}$), we conclude
that (4.12) holds in the whole of $\Omega$. Taking $\alpha = \gamma^* = 6$ we get
the
desired rigorous proof of (4.9).
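With (4.9) justified, the exit probabilities of Example 4.6 follow from three linear equations: $p_0 + p_3 + p_7 = 1$ together with the optional stopping identities for $\gamma = 1$ and $\gamma^* = 6$. The following Python sketch solves this system; the function names and the small Gaussian elimination routine are our own illustrative choices.

```python
import math


def kappa(alpha):
    """Right-hand side of (4.7): beta = 3, equal mixture of exponential(3) and exponential(7)."""
    return 3.0 * (0.5 * 3.0 / (3.0 - alpha) + 0.5 * 7.0 / (7.0 - alpha) - 1.0) - alpha


def solve3(A, rhs):
    """Gaussian elimination with partial pivoting for a 3x3 linear system."""
    n = 3
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def exit_probabilities(a, b):
    """Return (p0, p3, p7) for two-sided exit of the claim surplus from [-a, b]."""
    rows, rhs = [[1.0, 1.0, 1.0]], [1.0]
    for g in (1.0, 6.0):  # the roots gamma = 1 and gamma* = 6 of kappa
        e = math.exp(g * b)
        rows.append([math.exp(-g * a), e * 3.0 / (3.0 - g), e * 7.0 / (7.0 - g)])
        rhs.append(1.0)
    return solve3(rows, rhs)
```

Because the coefficients involve $e^{6b}$, the system becomes badly scaled for large $b$; the sketch is meant for moderate barrier levels only.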


The picture that emerges is that all roots of the analytic continuation of
$\log E e^{\alpha(R_1 - R_0)}$ may enter in ruin formulas, and we will see several
examples of this, in particular when phase-type distributions are involved (see
e.g. XI.5 for a much more elaborate version of Example 4.9).

The problem with the Wald martingale is of course that one only gets (4.9)
for 0 and $\gamma$, and this is not enough to do an analytic continuation. Other
values of $\alpha$ require conditional expectations of $e^{-\kappa(\alpha)\tau}$
given the type of exit, and we are not aware of how to approach these via the
Wald martingale. However, Example 4.9 shows how to solve the problem via the
Kella-Whitt martingale.


Notes and references Optional stopping of the Kella-Whitt martingale is further 
discussed in Asmussen & Kella [85], and Markov-modulated versions of the martingale 
are in Asmussen & Kella [84]. A variety of different applications are surveyed in 
[APQ, Ch. IX.3].

Chapter III


Further general tools and 
results 


1 Likelihood ratios and change of measure 


We consider stochastic processes $\{X_t\}$ with a Polish state space $E$ and
paths in the Skorohod space $D_E = D_E[0,\infty)$, which we equip with the natural
filtration $\{\mathcal{F}_t\}_{t \ge 0}$ and the Borel $\sigma$-field
$\mathcal{F}$. Two such processes may be represented by probability measures
$P$, $\tilde P$ on $(D_E, \mathcal{F})$, and in analogy with the theory of measures
on finite-dimensional spaces one could study conditions for the Radon-Nikodym
derivative $d\tilde P/dP$ to exist. However, as shown by the following example
this set-up is too restrictive: typically¹, the parameters of the two processes
can be reconstructed from a single infinite path, and $P$, $\tilde P$ are then
singular (concentrated on two disjoint measurable sets).


Example 1.1 Let $P$, $\tilde P$ correspond to the claim surplus process of two
compound Poisson risk processes with Poisson rates $\beta$, $\tilde\beta$ and
claim size distributions $B$, $\tilde B$. The number $N_t^{(\varepsilon)}$ of
jumps $> \varepsilon$ before time $t$ is a (measurable) r.v. on
$(D_E, \mathcal{F})$, hence so is
$N_t = \lim_{\varepsilon \downarrow 0} N_t^{(\varepsilon)}$. Thus the sets

$$S = \Big\{\lim_{t \to \infty} \frac{N_t}{t} = \beta\Big\}, \qquad \tilde S = \Big\{\lim_{t \to \infty} \frac{N_t}{t} = \tilde\beta\Big\}$$

are both in $\mathcal{F}$. But if $\beta \ne \tilde\beta$, then $S$ and $\tilde S$
are disjoint, and by the law of large numbers for the Poisson process,
$P(S) = \tilde P(\tilde S) = 1$. A somewhat similar argument gives singularity
when $B \ne \tilde B$.

¹ Though not always: it is not difficult to construct a counterexample, say in
terms of transient Markov processes.


The interesting concept is therefore to look for absolute continuity only on
finite time intervals (possibly random, cf. Theorem 1.3 below). I.e. we look for
a process $\{L_t\}$ (the likelihood ratio process) such that

$$\tilde P(A) = E[L_t;\, A], \qquad A \in \mathcal{F}_t, \tag{1.1}$$

(i.e. the restriction of $\tilde P$ to $(D_E, \mathcal{F}_t)$ is absolutely
continuous w.r.t. the restriction of $P$ to $(D_E, \mathcal{F}_t)$).
The following result gives the connection to martingales. 


Proposition 1.2 Let $\{\mathcal{F}_t\}_{t \ge 0}$ be the natural filtration on
$D_E$, $\mathcal{F}$ the Borel $\sigma$-field and $P$ a given probability measure
on $(D_E, \mathcal{F})$.
(i) If $\{L_t\}_{t \ge 0}$ is a non-negative martingale w.r.t.
$(\{\mathcal{F}_t\}, P)$ such that $E L_t = 1$, then there exists a unique
probability measure $\tilde P$ on $\mathcal{F}$ such that (1.1) holds.
(ii) Conversely, if for some probability measure $\tilde P$ and some
$\{\mathcal{F}_t\}$-adapted process $\{L_t\}_{t \ge 0}$ (1.1) holds, then $\{L_t\}$
is a non-negative martingale w.r.t. $(\{\mathcal{F}_t\}, P)$ such that
$E L_t = 1$.


Proof. Under the assumptions of (i), define $\tilde P_t$ by
$\tilde P_t(A) = E[L_t; A]$, $A \in \mathcal{F}_t$. Then $L_t \ge 0$ and
$E L_t = 1$ ensure that $\tilde P_t$ is a probability measure on
$(D_E, \mathcal{F}_t)$. Let $s \le t$, $A \in \mathcal{F}_s$. Then

\begin{align*}
\tilde P_t(A) = E[L_t; A] &= E\, E[L_t I(A) \mid \mathcal{F}_s] = E\, I(A)\, E[L_t \mid \mathcal{F}_s] \\
&= E\, I(A) L_s = \tilde P_s(A),
\end{align*}

using the martingale property in the fourth step. Hence the family
$\{\tilde P_t\}_{t \ge 0}$ is consistent and hence extendable to a probability
measure $\tilde P$ on $(D_E, \mathcal{F})$ such that
$\tilde P(A) = \tilde P_t(A)$, $A \in \mathcal{F}_t$. This proves (i).

Conversely, under the assumptions of (ii) we have for $A \in \mathcal{F}_s$ and
$s < t$ that $A \in \mathcal{F}_t$ as well and hence $E[L_s; A] = E[L_t; A]$. The
truth of this for all $A \in \mathcal{F}_s$ implies that
$E[L_t \mid \mathcal{F}_s] = L_s$ and the martingale property. Finally,
$E L_t = 1$ follows by taking $A = D_E$ in (1.1) and non-negativity by letting
$A = \{L_t < 0\}$. Then $\tilde P(A) = E[L_t; L_t < 0]$ can only be non-negative
if $P(A) = 0$.


The following likelihood ratio identity (typically with $\tau$ being the time
$\tau(u)$ to ruin) is a fundamental tool throughout the book:

Theorem 1.3 Let $\{L_t\}$, $\tilde P$ be as in Proposition 1.2(i). If $\tau$ is a
stopping time and $G \in \mathcal{F}_\tau$, $G \subseteq \{\tau < \infty\}$, then

$$P(G) = \tilde E\left[\frac{1}{L_\tau};\ G\right]. \tag{1.2}$$

Further, if $Z$ is a r.v. which is $\mathcal{F}_\tau$-measurable and 0 on the set
$\{\tau = \infty\}$, then

$$E Z = \tilde E[Z/L_\tau]. \tag{1.3}$$


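Proof. Assume first $G \subseteq \{\tau \le T\}$ for some fixed deterministic
$T < \infty$. By the martingale property, we have
$E[L_T \mid \mathcal{F}_\tau] = L_\tau$ on $\{\tau \le T\}$. Hence

\begin{align*}
\tilde E\left[\frac{1}{L_\tau};\ G\right] &= E\left[\frac{L_T}{L_\tau};\ G\right] = E\left[\frac{1}{L_\tau}\, I(G)\, E[L_T \mid \mathcal{F}_\tau]\right] \\
&= E\left[\frac{1}{L_\tau}\, I(G)\, L_\tau\right] = P(G). \tag{1.4}
\end{align*}

In the general case, applying (1.4) to $G \cap \{\tau \le T\}$ we get

$$P(G \cap \{\tau \le T\}) = \tilde E\left[\frac{1}{L_\tau};\ G \cap \{\tau \le T\}\right].$$

Since everything is non-negative, both sides are increasing in $T$, and letting
$T \to \infty$, (1.2) follows by monotone convergence.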

For (1.3), just use standard measure-theoretic arguments to extend from 
indicators to r.v.’s. 
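The identity (1.2) can be verified exactly in a simple discrete setting. The Python sketch below is our own illustration, not taken from the text: $\{X_n\}$ is a $\pm 1$ random walk with up-probability `p`, $L_n = e^{\theta X_n - n\kappa(\theta)}$ with $\kappa(\theta) = \log(p e^{\theta} + (1-p)e^{-\theta})$, $\tau$ is the first hitting time of an integer level $u \ge 1$, and $G = \{\tau \le n\}$. Under the tilted measure the walk has up-probability $\tilde p = p e^{\theta - \kappa(\theta)}$, and $1/L_\tau = e^{-\theta u + \tau\kappa(\theta)}$ on $G$.

```python
import math


def first_passage_probs(p, u, n):
    """P(tau = k) for k = 0..n, where tau is the first time the walk hits level u >= 1."""
    probs = [0.0] * (n + 1)
    alive = {0: 1.0}  # mass of paths that have not yet reached u
    for k in range(1, n + 1):
        nxt = {}
        for pos, mass in alive.items():
            if pos + 1 == u:
                probs[k] += mass * p
            else:
                nxt[pos + 1] = nxt.get(pos + 1, 0.0) + mass * p
            nxt[pos - 1] = nxt.get(pos - 1, 0.0) + mass * (1.0 - p)
        alive = nxt
    return probs


def ruin_prob_direct(p, u, n):
    """P(tau <= n) computed under the original measure."""
    return sum(first_passage_probs(p, u, n))


def ruin_prob_tilted(p, u, n, theta):
    """E~[1 / L_tau; tau <= n] computed under the exponentially tilted measure."""
    kap = math.log(p * math.exp(theta) + (1.0 - p) * math.exp(-theta))
    p_tilde = p * math.exp(theta - kap)
    fp = first_passage_probs(p_tilde, u, n)
    return sum(q * math.exp(-theta * u + k * kap) for k, q in enumerate(fp))
```

Both functions return the same number for any `theta`. Choosing `theta = math.log((1 - p) / p)` makes $\kappa(\theta) = 0$, so that for $p < 1/2$ the tilted walk drifts upwards and $P(\tau \le n)$ is $e^{-\theta u}$ times a probability tending to 1.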


A main example of change of measure is to take $\{L_t\}$ as the Wald martingale
$e^{\theta X_t - t\kappa(\theta)}$ in the case where $\{X_t\}$ is a random walk or a
Lévy process. We then write $P_\theta$ rather than $\tilde P$, and talk about
exponential change of measure or exponential tilting. We will see in Section 3
that $\{X_t\}$ remains a random walk/Lévy process under $P_\theta$, only with
changed parameters. A first elegant application of the change-of-measure
technique is the following observation:


Corollary 1.4 The necessary and sufficient condition for optional stopping of
the Wald martingale, i.e. $1 = E e^{\theta X_\tau - \tau\kappa(\theta)}$ for a given
stopping time $\tau$ with $\tau < \infty$, is that $P_\theta(\tau < \infty) = 1$.


Proof. Let

$$Z = L_\tau = e^{\theta X_\tau - \tau\kappa(\theta)} = e^{\theta X_\tau - \tau\kappa(\theta)}\, I(\tau < \infty).$$

Then the assertion means $EZ = 1$, whereas (1.3) gives
$EZ = P_\theta(\tau < \infty)$.

From Theorem 1.3 we obtain a likelihood ratio representation of the ruin
probability $\psi(u)$ parallel to the martingale representation II.(3.1) in
Proposition II.3.1:


Corollary 1.5 Under condition (a) of Proposition II.3.1,

$$\psi(u) = e^{-\gamma u}\, E_\gamma\big[e^{-\gamma\xi(u)};\ \tau(u) < \infty\big]. \tag{1.5}$$


Proof. Letting $G = \{\tau(u) < \infty\}$, we have $P(G) = \psi(u)$. Now just
rewrite the r.h.s. of (1.2) by noting that

$$\frac{1}{L_{\tau(u)}} = e^{-\gamma S_{\tau(u)}} = e^{-\gamma u}\, e^{-\gamma\xi(u)}.$$

The advantage of (1.5) compared to II.(3.1) is that it seems in general easier
to deal with the (unconditional) expectation
$E_\gamma[e^{-\gamma\xi(u)};\ \tau(u) < \infty]$ occurring there than with the
(conditional) expectation $E[e^{\gamma\xi(u)} \mid \tau(u) < \infty]$ in II.(3.1).
The crucial step is to obtain information on the process evolving according to
$\tilde P$, and this problem will now be studied, first in the Markov case and
next (Sections 3, 4) for processes with some random-walk-like structure.

As another simple application of the change-of-measure technique, we shall 
establish a formula for the finite-time ruin probability of Brownian motion: 


Corollary 1.6 Let $\{R_t\}$ be Brownian motion with drift $-\mu$ and variance
constant 1. Then the density and c.d.f. of $\tau(u)$ are

$$P_\mu(\tau(u) \in dT) = \frac{u}{\sqrt{2\pi}\, T^{3/2}} \exp\left\{-\frac{(u - T\mu)^2}{2T}\right\} dT, \tag{1.6}$$

$$P_\mu(\tau(u) \le T) = 1 - \Phi\!\left(\frac{u}{\sqrt{T}} - \mu\sqrt{T}\right) + e^{2\mu u}\, \Phi\!\left(-\frac{u}{\sqrt{T}} - \mu\sqrt{T}\right). \tag{1.7}$$

For a general variance constant $\sigma^2$, one furthermore obtains

$$P_\mu(\tau(u) \le T) = \Phi\!\left(\frac{-u + \mu T}{\sigma\sqrt{T}}\right) + e^{2\mu u/\sigma^2}\, \Phi\!\left(\frac{-u - \mu T}{\sigma\sqrt{T}}\right). \tag{1.8}$$

Proof. Consider first the case $\sigma^2 = 1$. For $\mu = 0$, (1.7) is the same as
II.(2.8), and (1.6) follows then by straightforward differentiation. For
$\mu \ne 0$, the ratio $dP_\mu/dP_0$ of densities of $S_t$ is
$e^{\mu S_t - t\mu^2/2}$, since $\kappa(\theta) = -\theta\mu + \theta^2\sigma^2/2$.
Hence

\begin{align*}
P_\mu(\tau(u) \in dT) &= E_0\big[e^{\mu S_{\tau(u)} - \tau(u)\mu^2/2};\ \tau(u) \in dT\big] \\
&= e^{\mu u - T\mu^2/2}\, P_0(\tau(u) \in dT) \\
&= e^{\mu u - T\mu^2/2}\, \frac{u}{\sqrt{2\pi}}\, T^{-3/2} \exp\left\{-\frac{u^2}{2T}\right\} dT,
\end{align*}

which is the same as (1.6). (1.7) then follows by checking that the derivative of
the r.h.s. is (1.6) and that the value at 0 is 0.

For a general $\sigma^2$, a completely analogous calculation can be done, but the
analogue of (1.6) is more tedious to write down, so we omit the details.


The same argument as used for Corollary 1.6 also applies to Bernoulli random
walk with $\theta \ne 1/2$, but we again omit the details.
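The finite-time formulas are straightforward to evaluate with the standard normal c.d.f. The Python sketch below is an illustration under an explicit sign convention, which is our assumption: `mu` denotes the drift of the claim surplus $\{S_t\}$ (so the reserve has drift $-\mu$), which is the convention under which (1.6)-(1.8) reduce to II.(2.8) for $\mu = 0$ and to $e^{2\mu u/\sigma^2}$ as $T \to \infty$ when $\mu < 0$. Function names are ours.

```python
import math


def Phi(x):
    """Standard normal c.d.f."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def ruin_cdf(u, T, mu, sigma2=1.0):
    """P(tau(u) <= T) as in (1.8); mu is the drift of the claim surplus."""
    s = math.sqrt(sigma2 * T)
    return Phi((-u + mu * T) / s) + math.exp(2.0 * mu * u / sigma2) * Phi((-u - mu * T) / s)


def ruin_density(u, T, mu):
    """Density of tau(u) as in (1.6), variance constant 1."""
    return u / math.sqrt(2.0 * math.pi * T**3) * math.exp(-((u - mu * T) ** 2) / (2.0 * T))
```

A finite-difference derivative of `ruin_cdf` in `T` reproduces `ruin_density`, which is the check referred to in the proof of Corollary 1.6.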


Consider next a (time-homogeneous) Markov process $\{X_t\}$ with state space
$E$, say, in continuous time (the discrete time case is parallel but slightly
simpler). In the context of ruin probabilities, one would typically have
$X_t = R_t$, $X_t = S_t$, $X_t = (J_t, R_t)$ or $X_t = (J_t, S_t)$, where $\{R_t\}$
is the risk reserve process, $\{S_t\} = \{u - R_t\}$ the claim surplus process and
$\{J_t\}$ a process of supplementary variables possibly needed to make the
process Markovian. A change of measure is performed by finding a process
$\{L_t\}$ which is a martingale w.r.t. each $P_x$, is non-negative and has
$E_x L_t = 1$ for all $x, t$. The problem is thus to investigate which
characteristics of $\{X_t\}$ and $\{L_t\}$ ensure a given set of properties of the
changed probability measure.

First we ask when the Markov property is preserved. To this end, we need
the concept of a multiplicative functional. For the definition, we assume for
simplicity that $\{X_t\}$ has D-paths, is Markov w.r.t. the natural filtration
$\{\mathcal{F}_t\}$ on $D_E$ and define $\{L_t\}$ to be a multiplicative functional
if $\{L_t\}$ is adapted to $\{\mathcal{F}_t\}$, non-negative and

$$L_{t+s} = L_t \cdot (L_s \circ \theta_t) \tag{1.9}$$

$P_x$-a.s. for all $x, s, t$, where $\theta_t$ is the shift operator. The precise
meaning of this is the following: being $\mathcal{F}_t$-measurable, $L_t$ has the
form

$$L_t = \varphi_t\big(\{X_u\}_{0 \le u \le t}\big)$$

for some mapping $\varphi_t : D_E[0,t] \to [0,\infty)$, and then

$$L_s \circ \theta_t = \varphi_s\big(\{X_{t+u}\}_{0 \le u \le s}\big).$$


Theorem 1.7 Let $\{X_t\}$ be Markov w.r.t. the natural filtration
$\{\mathcal{F}_t\}$ on $D_E$, let $\{L_t\}$ be a non-negative martingale with
$E_x L_t = 1$ for all $x, t$ and let $\tilde P_x$ be the probability measure given
by $\tilde P_x(A) = E_x[L_t; A]$. Then the family $\{\tilde P_x\}_{x \in E}$ defines
a time-homogeneous Markov process if and only if $\{L_t\}$ is a multiplicative
functional.


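Proof. Since both sides of (1.9) are $\mathcal{F}_{t+s}$-measurable, (1.9) is
equivalent to

$$E_x[L_{t+s} V_{t+s}] = E_x[L_t \cdot (L_s \circ \theta_t)\, V_{t+s}] \tag{1.10}$$

for any $\mathcal{F}_{t+s}$-measurable r.v. $V_{t+s}$, which in turn is the same as

$$E_x[L_{t+s} Z_t \cdot (Y_s \circ \theta_t)] = E_x[L_t \cdot (L_s \circ \theta_t)\, Z_t \cdot (Y_s \circ \theta_t)] \tag{1.11}$$

for any $\mathcal{F}_t$-measurable $Z_t$ and any $\mathcal{F}_s$-measurable $Y_s$.
Indeed, since $Z_t \cdot (Y_s \circ \theta_t)$ is $\mathcal{F}_{t+s}$-measurable,
(1.10) implies (1.11). The converse follows since the class of r.v.'s of the form
$Z_t \cdot (Y_s \circ \theta_t)$ comprises all r.v.'s of the form
$\prod_{i=1}^{n} f_i(X_{t(i)})$ with all $t(i) \le t + s$.

Similarly, the Markov property can be written

$$\tilde E_x[Y_s \circ \theta_t \mid \mathcal{F}_t] = \tilde E_{X_t} Y_s, \qquad t < s,$$

for any $\mathcal{F}_s$-measurable r.v. $Y_s$, which is the same as

$$\tilde E_x[Z_t (Y_s \circ \theta_t)] = \tilde E_x[Z_t\, \tilde E_{X_t} Y_s]$$

for any $\mathcal{F}_t$-measurable r.v. $Z_t$. By definition of $\tilde P_x$, this
in turn means

$$E_x[L_{t+s} Z_t (Y_s \circ \theta_t)] = E_x[L_t Z_t\, E_{X_t}[L_s Y_s]],$$

or, since $E_{X_t}[L_s Y_s] = E[(Y_s \circ \theta_t)(L_s \circ \theta_t) \mid \mathcal{F}_t]$,

$$E_x[L_{t+s} Z_t (Y_s \circ \theta_t)] = E_x[L_t Z_t (Y_s \circ \theta_t)(L_s \circ \theta_t)], \tag{1.12}$$

which is the same as (1.11).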


Remark 1.8 For $\{\tilde P_x\}_{x \in E}$ to define a time-homogeneous Markov
process, it suffices to assume that $\{L_t\}$ is a multiplicative functional with
$E_x L_t = 1$ for all $x, t$. Indeed, then

$$E_x[L_{t+s} \mid \mathcal{F}_t] = L_t\, E_x[L_s \circ \theta_t \mid \mathcal{F}_t] = L_t\, E_{X_t} L_s = L_t$$

(using the Markov property in the second step) so that the martingale property
is automatic.


Notes and references The results of the present section are essentially known 
in a very general Markov process formulation, see Dynkin [338] and Kunita [562]. 
A more elementary version along the lines of Theorem 1.7 can be found in Küchler 
& Sørensen [561], with a proof somewhat different from the present one. A further 
relevant reference is Barndorff-Nielsen & Shiryaev [139].

2 Duality with other applied probability models


In this section, we shall establish a general connection between ruin probabilities 
and certain stochastic processes which occur for example as models for queueing 
and storage. The formulation has applications to virtually all the risk models 
studied in this book. 

The result is a sample path relation, and thus for the moment no parametric
assumptions (on say the structure of the arrival process) are needed. We
work on a finite time interval $[0, T]$ in the following set-up (which can be much
generalized):


The risk process $\{R_t\}_{0 \le t \le T}$ has arrivals at epochs $\sigma_1, \dots, \sigma_N$, $0 < \sigma_1 < \dots < \sigma_N < T$.
The corresponding claim sizes are $U_1, \dots, U_N$. In between
jumps, the premium rate is $p(r) > 0$ when the reserve is $r$ (i.e., $\dot{R} = p(R)$).
Thus

$$R_t = R_0 + \int_0^t p(R_s)\,ds - A_t, \quad \text{where } A_t = \sum_{k:\, \sigma_k \le t} U_k. \tag{2.1}$$

The initial condition is arbitrary, $R_0 = u$ (say), and the time to ruin is
$\tau(u) = \inf\{t \ge 0 : R_t < 0\}$.


The storage process $\{V_t\}_{0 \le t \le T}$ is essentially defined by time-reversion, reflection
at zero and initial condition $V_0 = 0$. More precisely, the arrival
epochs are $\sigma_1^*, \dots, \sigma_N^*$, where $\sigma_k^* = T - \sigma_{N-k+1}$, and just after time $\sigma_k^*$ $\{V_t\}$
makes an upwards jump of size $U_k^* = U_{N-k+1}$. In between jumps, $\{V_t\}$
decreases at rate $p(r)$ when $V_t = r$ (i.e., $\dot{V} = -p(V)$). That is, instead of
(2.1) we have

$$V_t = A_t^* - \int_0^t p(V_s)\,ds, \quad \text{where } A_t^* = \sum_{k:\, \sigma_k^* < t} U_k^* = A_T - A_{T-t},$$

and we use the convention $p(0) = 0$ to make zero a reflecting barrier (when
hitting 0, $\{V_t\}$ remains at 0 until the next arrival).


Note that these definitions make $\{R_t\}$ right-continuous (as standard) and $\{V_t\}$
left-continuous. The sample path relation between these two processes is illustrated
in Fig. III.1.

FIGURE III.1


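Define $\tau(u) = \inf\{t \ge 0 : R_t < 0\}$ ($\tau(u) = \infty$ if $R_t \ge 0$ for all $t \le T$) and let

$$\psi(u, T) = \mathbb{P}\Bigl(\inf_{0 \le t \le T} R_t < 0\Bigr) = \mathbb{P}(\tau(u) \le T)$$

be the finite-time ruin probability.


Theorem 2.1 The events $\{\tau(u) \le T\}$ and $\{V_T > u\}$ coincide. In particular,

$$\psi(u, T) = \mathbb{P}(V_T > u). \tag{2.2}$$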


Proof. Let $r_t^{(u)}$ denote the solution of $\dot{R} = p(R)$ subject to $r_0^{(u)} = u$. Then $r_t^{(u)} > r_t^{(v)}$
for all $t$ when $u > v$.
Suppose first $V_T > u$ (this situation corresponds to the solid path of $\{R_t\}$ in
Fig. III.1 with $R_0 = u = u_1$). Then

$$V_{\sigma_N^*} = r_{\sigma_1}^{(V_T)} - U_1 > r_{\sigma_1}^{(u)} - U_1 = R_{\sigma_1}.$$

If $V_{\sigma_N^*} > 0$, we can repeat the argument and get $V_{\sigma_{N-1}^*} > R_{\sigma_2}$ and so on. Hence
if $n$ satisfies $V_{\sigma_{N-n+1}^*} = 0$ (such an $n$ exists, if nothing else $n = N$), we have
$R_{\sigma_n} < 0$ so that indeed $\tau(u) \le T$.
Suppose next $V_T \le u$ (this situation corresponds to the dotted path of $\{R_t\}$
in Fig. III.1 with $R_0 = u = u_2$). Then similarly

$$V_{\sigma_N^*} = r_{\sigma_1}^{(V_T)} - U_1 \le r_{\sigma_1}^{(u)} - U_1 = R_{\sigma_1}, \qquad V_{\sigma_{N-1}^*} \le R_{\sigma_2},$$

and so on. Hence $R_{\sigma_n} \ge 0$ for all $n \le N$, and since ruin can only occur at the
times of claims, we have $\tau(u) > T$.
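
The sample path relation of Theorem 2.1 can be checked numerically. The following is a minimal sketch, assuming a constant premium rate $p(r) = p$ for $r > 0$ (with $p(0) = 0$ at the barrier), Poisson arrivals and exponential claims; these distributional choices and all function names are illustrative assumptions and are not needed for the theorem itself.

```python
import random

def sample_claims(rate, T, rng):
    """Poisson arrival epochs in (0, T) with Exp(1) claim sizes (illustrative choice)."""
    epochs, claims = [], []
    t = rng.expovariate(rate)
    while t < T:
        epochs.append(t)
        claims.append(rng.expovariate(1.0))
        t += rng.expovariate(rate)
    return epochs, claims

def ruined_by(u, epochs, claims, p=1.0):
    """True if R_t = u + p*t - A_t is negative just after some claim epoch."""
    paid = 0.0
    for s, c in zip(epochs, claims):
        paid += c
        if u + p * s - paid < 0:
            return True
    return False

def storage_at(T, epochs, claims, p=1.0):
    """V_T of the time-reversed, reflected storage process started at V_0 = 0."""
    v, t = 0.0, 0.0
    for s, c in zip(reversed(epochs), reversed(claims)):
        arrival = T - s                      # sigma*_k = T - sigma_{N-k+1}
        v = max(v - p * (arrival - t), 0.0)  # release at rate p, stay at 0 once empty
        v += c                               # upward jump U*_k = U_{N-k+1}
        t = arrival
    return max(v - p * (T - t), 0.0)

def check_duality(trials=2000, T=10.0, rate=0.8, seed=1):
    """Compare the events {tau(u) <= T} and {V_T > u} path by path."""
    rng = random.Random(seed)
    for _ in range(trials):
        epochs, claims = sample_claims(rate, T, rng)
        u = rng.uniform(0.0, 5.0)
        if ruined_by(u, epochs, claims) != (storage_at(T, epochs, claims) > u):
            return False
    return True
```

For instance, with claims of sizes 3 and 0.5 at epochs 1 and 2 and $T = 4$, the sketch gives $V_T = 2$, and ruin before $T$ occurs exactly for initial reserves $u < 2$.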
A basic example is when $\{R_t\}$ is the risk reserve process corresponding to
claims arriving at Poisson rate $\beta$ and being i.i.d. with distribution $B$, and a
general premium rule $p(r)$ when the reserve is $r$. Then the time reversibility of
the Poisson process ensures that $\{A_t\}$ and $\{A_t^*\}$ have the same distribution (for
finite-dimensional distributions, the distinction between right- and left-continuity
is immaterial because the probability of a Poisson arrival at any fixed time
$t$ is zero). Thus we may think of $\{V_t\}$ as having compound Poisson input and
being defined for all $t < \infty$. Historically, this represents a model for storage, say
of water in a dam, though other interpretations like the amount of goods stored
are also possible. The arrival epochs correspond to rainfalls, and in between
rainfalls water is released at rate $p(r)$ when $V_t$ (the content) is $r$. We get:


Corollary 2.2 Consider the compound Poisson risk model with a general premium
rule $p(r)$. Then the storage process $\{V_t\}$ has a proper limit in distribution
as $t \to \infty$, say $V$, if and only if $\psi(u) < 1$ for all $u$, and then

$$\psi(u) = \mathbb{P}(V > u). \tag{2.3}$$


Proof. Let $T \to \infty$ in (2.2).


Consider now a compound Poisson risk model with constant premium rate
1 and claim arrival rate $\beta$. Then a direct relationship can be obtained between
the survival probability in the risk model and the maximum workload $V_{\max}$
of a busy period in an M/G/1 queue with arrival rate $\beta$.


Theorem 2.3 Under the safety loading condition $\eta > 0$, we have for every
$u \ge 0$

$$\mathbb{P}(V_{\max} > u) = \frac{1}{\beta} \frac{d}{du} \log \phi(u), \tag{2.4}$$

where $\phi(u) = 1 - \psi(u)$ is the survival probability.


Proof. The risk process $R_t$ starting in $u$ can only survive if after each claim that
occurs at some running maximum $s \ge u$, the level $s$ will be reached again before
ruin occurs. This is equivalent to the statement that the maximum workload
$V_{\max}$ of a busy period in an M/G/1 queue (with traffic intensity $\rho < 1$) does
not exceed $s$. As we are only concerned about eventual survival, we can cut out
every such 'surviving' excursion away from the running maximum of the risk
process and only consider those claims at the running maximum which lead to a
downward excursion causing ruin before recovering to level $s$. The instantaneous
probability of having a claim of the latter type is

$$\beta\,dt\,\mathbb{P}(V_{\max} > s) = \beta\,ds\,\mathbb{P}(V_{\max} > s),$$

since at the running maximum we have $ds = dt$. Consequently, the survival
probability $\phi(u)$ can simply be interpreted as the probability to have zero events
during $[u, \infty)$ of an inhomogeneous Poisson process with rate $\beta(s) = \beta\,\mathbb{P}(V_{\max} > s)$
(which constitutes a thinning of the original Poisson process). This finally
implies

$$\phi(u) = \exp\Bigl(-\int_u^\infty \beta(s)\,ds\Bigr) = \exp\Bigl(-\beta \int_u^\infty \mathbb{P}(V_{\max} > s)\,ds\Bigr). \tag{2.5}$$


The relation (2.4) also follows from a combination of (2.3) and the identity

$$\mathbb{P}(V_{\max} > u) = \frac{1}{\beta} \frac{d}{du} \log \mathbb{P}(V \le u)$$

for an M/G/1 queue (see for instance Cohen [249, p.618]), but the above proof
establishes a direct and self-contained link between $V_{\max}$ and $\phi(u)$ that will also
be useful later on (cf. VIII.4).


Notes and references Two main references on storage processes are Harrison &
Resnick [451] and Brockwell, Resnick & Tweedie [203]. Theorem 2.1 and its proof are
from Asmussen & Schock Petersen [104], Corollary 2.2 from Harrison & Resnick [452].
The results can be viewed as special cases of Siegmund duality, see Siegmund [808]. 
Some further more general references are Asmussen [63] and Asmussen & Sigman [105]. 

Theorem 2.3 is due to Albrecher, Borst, Boxma & Resing [16]. 

Historically, the connection between risk theory and other applied probability areas 
appears first to have been noted by Prabhu [711] in a queueing context. It is a standard 
tool today, but one may feel that the interaction between the different areas was 
surprisingly limited in the first decades after the appearance of [711]. 


3 Random walks in discrete or continuous time 


Consider a random walk $X_n = X_0 + Y_1 + \dots + Y_n$ in discrete time where the $Y_i$
are i.i.d., with common distribution $F$.

For discrete time random walks, there is an analogue of Theorem 2.1 in terms
of Lindley processes. For a given i.i.d. $\mathbb{R}$-valued sequence $Z_1, Z_2, \dots$, the Lindley
process $W_0, W_1, W_2, \dots$ generated by $Z_1, Z_2, \dots$ is defined by assigning $W_0$ some
arbitrary value $\ge 0$ and letting

$$W_{n+1} = (W_n + Z_{n+1})^+. \tag{3.1}$$

Thus $\{W_n\}_{n=0,1,\dots}$ evolves as a random walk with increments $Z_1, Z_2, \dots$ as long
as the random walk only takes non-negative values, and is reset to 0 once the
random walk hits $(-\infty, 0)$. I.e., $\{W_n\}_{n=0,1,\dots}$ can be viewed as the reflected
version of the random walk with increments $Z_1, Z_2, \dots$ In particular, if $W_0 = 0$
then

$$W_N = Z_1 + \dots + Z_N - \min_{n=0,1,\dots,N} (Z_1 + \dots + Z_n) \tag{3.2}$$

(for a rigorous proof, just verify that the r.h.s. of (3.2) satisfies the same recursion
as in (3.1)).


Theorem 3.1 Let $\tau(u) = \inf\{n : u + Y_1 + \dots + Y_n < 0\}$. Let further $N$ be
fixed and let $W_0, W_1, \dots, W_N$ be the Lindley process generated by $Z_1 = -Y_N$,
$Z_2 = -Y_{N-1}, \dots, Z_N = -Y_1$ according to $W_0 = 0$. Then the events $\{\tau(u) \le N\}$
and $\{W_N > u\}$ coincide.


Proof. By (3.2),

$$\begin{aligned} W_N &= -Y_N - \dots - Y_1 - \min_{n=0,1,\dots,N} (-Y_N - \dots - Y_{N-n+1}) \\ &= -\min_{n=0,1,\dots,N} (Y_1 + \dots + Y_{N-n}) = -\min_{n=0,1,\dots,N} (Y_1 + \dots + Y_n). \end{aligned}$$

From this the result immediately follows.
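
As a small numerical illustration of (3.1), (3.2) and Theorem 3.1, the sketch below iterates the Lindley recursion and compares it with the reflection formula and with the ruin event. Integer-valued increments are an assumption made only so that the comparisons are exact; the function names are hypothetical.

```python
import random

def lindley(zs, w0=0):
    """Iterate W_{n+1} = max(W_n + Z_{n+1}, 0) and return W_N."""
    w = w0
    for z in zs:
        w = max(w + z, 0)
    return w

def reflected(zs):
    """Right-hand side of (3.2): S_N - min_{0<=n<=N} S_n with S_0 = 0."""
    s, low = 0, 0
    for z in zs:
        s += z
        low = min(low, s)
    return s - low

def ruin_time_at_most_N(u, ys):
    """True if u + Y_1 + ... + Y_n < 0 for some n <= N = len(ys)."""
    s = u
    for y in ys:
        s += y
        if s < 0:
            return True
    return False

def check(trials=1000, N=20, seed=7):
    rng = random.Random(seed)
    for _ in range(trials):
        ys = [rng.randint(-3, 4) for _ in range(N)]
        zs = [-y for y in reversed(ys)]   # Z_1 = -Y_N, ..., Z_N = -Y_1
        u = rng.randint(0, 6)
        if lindley(zs) != reflected(zs):
            return False
        if ruin_time_at_most_N(u, ys) != (lindley(zs) > u):
            return False
    return True
```

For the increments $2, -5, 1$ the recursion passes through $2, 0, 1$, in agreement with $S_3 - \min_n S_n = -2 - (-3) = 1$.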


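Corollary 3.2 The following assertions are equivalent:

(a) $\psi(u) = \mathbb{P}(\tau(u) < \infty) < 1$ for all $u \ge 0$;

(b) $\psi(u) = \mathbb{P}(\tau(u) < \infty) \to 0$ as $u \to \infty$;

(c) The Lindley process $\{W_N\}$ generated by $Z_1 = -Y_1$, $Z_2 = -Y_2, \dots$ has a
proper limit $W$ in distribution as $N \to \infty$;

(d) $m = \inf_{n=0,1,\dots} (Y_1 + \dots + Y_n) > -\infty$ a.s.;

(e) $Y_1 + \dots + Y_n \not\to -\infty$ a.s.

In that case, $W \stackrel{\mathscr{D}}{=} -m$ and $\mathbb{P}(W > u) = \mathbb{P}(-m > u) = \psi(u)$.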


Proof. Since $(Y_N, \dots, Y_1)$ has the same distribution as $(Y_1, \dots, Y_N)$, the Lindley
processes in Corollary 3.2 and Theorem 3.1 have the same distribution for $n = 0, 1, \dots, N$.
Thus the assertion of Theorem 3.1 is equivalent to

$$W_N \stackrel{\mathscr{D}}{=} M_N = \sup_{n=0,1,\dots,N} (Z_1 + \dots + Z_n),$$

so that $W_N \stackrel{\mathscr{D}}{\to} M = \sup_{n=0,1,\dots} (Z_1 + \dots + Z_n) = -m$ and $\mathbb{P}(W > u) = \mathbb{P}(M > u) = \psi(u)$.
By Kolmogorov's 0-1 law, either $M = \infty$ a.s. or $M < \infty$ a.s.
Combining these facts gives easily the equivalence of (a)-(d).

Clearly, (d) $\Rightarrow$ (e). The converse follows from general random walk theory
since it is standard that $\limsup (Y_1 + \dots + Y_n) = \infty$ when $Y_1 + \dots + Y_n \not\to -\infty$.
By the law of large numbers, a sufficient condition for (e) is that $\mathbb{E}Y$ is well-defined
and $> 0$. In general, the condition

$$\sum_{n=1}^{\infty} \frac{1}{n}\,\mathbb{P}(Y_1 + \dots + Y_n < 0) < \infty$$

is known to be necessary and sufficient ([APQ, p. 231]) but appears to be rather
intractable.


Remark 3.3 The i.i.d. assumption on the $Z_1, \dots, Z_N$ (or, equivalently, on the
$Y_1, \dots, Y_N$) in Theorem 3.1 is actually not necessary: the result is a sample
path relation, as is Theorem 2.1. Similarly, there is a more general version of
Corollary 3.2. One then assumes $Y_n$ to be a stationary sequence, w.l.o.g. doubly
infinite ($n = 0, \pm 1, \pm 2, \dots$), and defines $Z_n = -Y_{-n}$.


Next consider change of measure via likelihood ratios. 


For a random walk, a Markovian change of measure as in Theorem 1.7 does
not necessarily lead to a random walk: if, e.g., $F$ has a strictly positive density
and $\tilde{\mathbb{P}}_x$ corresponds to a Markov chain such that the density of $X_1$ given $X_0 = x$
is also strictly positive, then the restrictions of $\mathbb{P}_x$, $\tilde{\mathbb{P}}_x$ to $\mathscr{F}_n$ are equivalent (have
the same null sets) so that the likelihood ratio $L_n$ exists. The following result
gives the necessary and sufficient condition for $\{L_n\}$ to define a new random
walk:


Proposition 3.4 Let $\{L_n\}$ be a multiplicative functional of a random walk with
$\mathbb{E}_x L_n = 1$ for all $n$ and $x$. Then the change of measure in Theorem 1.7 corresponds
to a new random walk if and only if

$$L_n = h(Y_1) \cdots h(Y_n) \tag{3.3}$$

$\mathbb{P}_x$-a.s. for some function $h$ with $\mathbb{E}h(Y) = 1$. In that case, the changed increment
distribution is $\tilde{F}(x) = \mathbb{E}[h(Y); Y \le x]$.


Proof. If (3.3) holds, then

$$\tilde{\mathbb{E}}_x \prod_{i=1}^n f_i(Y_i) = \mathbb{E}_x \prod_{i=1}^n f_i(Y_i) h(Y_i) = \prod_{i=1}^n \mathbb{E}[f_i(Y_i) h(Y_i)] = \prod_{i=1}^n \int f_i(y)\,\tilde{F}(dy),$$

from which the random walk property is immediate with the asserted form of
$\tilde{F}$. Conversely, the random walk property implies $\tilde{\mathbb{E}}_x f(Y_1) = \tilde{\mathbb{E}}_0 f(Y_1)$. Since
$L_1$ has the form $g(X_0, Y_1)$, this means $\mathbb{E}[g(x, Y) f(Y)] = \mathbb{E}[g(0, Y) f(Y)]$ for all
$f$ and $x$, implying $g(x, Y) = h(Y)$ a.s. where $h(y) = g(0, y)$. In particular, (3.3)
holds for $n = 1$. For $n = 2$, we get

$$L_2 = L_1 (L_1 \circ \theta_1) = h(Y_1) g(X_1, Y_2) = h(Y_1) h(Y_2),$$

and so on for $n = 3, 4, \dots$.


A particularly important example is exponential change of measure ($h(y) = e^{\alpha y - \kappa(\alpha)}$
where $\kappa(\alpha) = \log \hat{F}[\alpha]$ is the c.g.f. of $F$). The corresponding likelihood
ratio is

$$L_n = \exp\{\alpha(Y_1 + \dots + Y_n) - n\kappa(\alpha)\}. \tag{3.4}$$

Thus $\{L_n\}$ is the Wald martingale, cf. II.1. We get:


Corollary 3.5 Consider a random walk and an $\alpha$ such that

$$\kappa(\alpha) = \log \hat{F}[\alpha] = \log \mathbb{E}e^{\alpha Y}$$

is finite, and define $L_n$ by (3.4). Then the change of measure in Theorem 1.7
corresponds to a new random walk with changed increment distribution

$$\tilde{F}(x) = e^{-\kappa(\alpha)} \int_{-\infty}^x e^{\alpha y}\,F(dy).$$
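
To make Corollary 3.5 concrete, here is a minimal sketch for an increment distribution with finite support (an assumption made for illustration; the function names are hypothetical). It computes $\kappa(\alpha)$ and the point masses of the tilted distribution, which can then be checked to sum to one and to have mean $\kappa'(\alpha)$.

```python
import math

def cgf(alpha, values, probs):
    """kappa(alpha) = log E exp(alpha * Y) for a finitely supported increment Y."""
    return math.log(sum(p * math.exp(alpha * y) for y, p in zip(values, probs)))

def tilt(alpha, values, probs):
    """Point masses of the changed law: exp(alpha*y - kappa(alpha)) * P(Y = y)."""
    k = cgf(alpha, values, probs)
    return [p * math.exp(alpha * y - k) for y, p in zip(values, probs)]

def mean(values, probs):
    """Mean of a finitely supported distribution."""
    return sum(y * p for y, p in zip(values, probs))
```

With increments $-1$ and $2$ taken with probabilities $0.8$ and $0.2$, the original mean is $-0.4$, and any $\alpha > 0$ shifts mass towards the value $2$ and hence increases the mean.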


Discrete time random walks have classical applications in queueing theory 
via the Lindley process representation of the waiting time, see Chapter VI. In 
risk theory, they arise as models for the reserve or claim surplus at a discrete 
sequence of instants, say the beginning of each month or year, or imbedded into 
continuous time processes, say by recording the reserve or claim surplus just 
before or just after claims (see Chapter VI for some fundamental examples). 
However, the tradition in the area is to use continuous time models. 


Now consider reflected versions of Lévy processes (cf. II.1). 


First assume in the setting of Section 2 that $\{R_t\}$ is the risk reserve process for
the compound Poisson risk model with constant premium rate $p(r) = 1$. Then
the storage process $\{V_t\}$ has constant release rate 1, i.e. has upwards jumps
governed by $B$ at the epochs of a Poisson process with rate $\beta$ and decreases
linearly at rate 1 in between jumps. A different interpretation is as the workload
or virtual waiting time process in an M/G/1 queue, defined as a system with
a single server working at a unit rate, having Poisson arrivals with rate $\beta$ and
distribution $B$ of the service times of the arriving customers. Here 'workload'
refers to the fact that we can interpret $V_t$ as the amount of time the server will
have to work until the system is empty provided no new customers arrive; virtual
waiting time refers to $V_t$ being the amount of time a customer would have to
wait before starting service if he arrived at time $t$ (this interpretation requires
FIFO = First In First Out queueing discipline: the customers are served in the
order of arrival).


Corollary 3.6 In the compound Poisson risk model with constant premium rate
$p(r) = 1$, $\psi(u, T) = \mathbb{P}(V_T > u)$, where $V_T$ is the virtual waiting time at time
$T$ in an initially empty M/G/1 queue with the same arrival rate $\beta$ and the
service times having the same distribution $B$ as the claims in the risk process.
Furthermore, $V_T \stackrel{\mathscr{D}}{\to} V$ for some r.v. $V \in [0, \infty]$, and $\psi(u) = \mathbb{P}(V > u)$.


[The condition for $V < \infty$ a.s. is easily seen to be $\beta\mu_B < 1$, cf. Chapter IV.]

Processes with a more complicated path structure like Brownian motion or
jump processes with unbounded Lévy measure are not covered by Section 2, and
the reflected version is then defined by means of the abstract reflection operator
as in (3.2),

$$W_T = X_T - \min_{0 \le t \le T} X_t$$

(assuming $W_0 = X_0 = 0$ for simplicity).


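Proposition 3.7 If $\{X_t\}$ is a Lévy process of the form $X_t = X_0 + \mu t + \sigma B_t + M_t$
as in II.(1.4), then

$$\mathbb{E}e^{\alpha(X_t - X_0)} = \mathbb{E}_0 e^{\alpha X_t} = e^{t\kappa(\alpha)} \tag{3.5}$$

where

$$\kappa(\alpha) = \alpha\mu + \alpha^2\sigma^2/2 + \int_{-\infty}^{\infty} (e^{\alpha x} - 1)\,\nu(dx) \tag{3.6}$$

provided the Lévy measure of the jump part $\{M_t\}$ satisfies $\int_{-\epsilon}^{\epsilon} |x|\,\nu(dx) < \infty$.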


Proof. This is basically an easy application of formulas II.(1.7), II.(1.8). To
repeat: by standard formulas for the normal distribution,

$$\mathbb{E}e^{\alpha(\mu t + \sigma B_t)} = e^{t\{\alpha\mu + \alpha^2\sigma^2/2\}}.$$

By explicit calculation, we show in the compound Poisson case ($\|\nu\| < \infty$) in
Proposition IV.1.1 that

$$\mathbb{E}e^{\alpha M_t} = \exp\Bigl\{t \int_{-\infty}^{\infty} (e^{\alpha x} - 1)\,\nu(dx)\Bigr\}.$$

In the general case, use the representation as limit of compound Poisson processes.


Note that (3.6) is the Lévy-Khinchine representation of the c.g.f. of an infinitely
divisible distribution (see, e.g., Chung [246]). This is of course no coincidence
since the distribution of $X_t - X_0$ is necessarily infinitely divisible when $\{X_t\}$
has stationary independent increments.


Theorem 3.8 Assume that $\{X_t\}$ is a Lévy process with $\int_{-\epsilon}^{\epsilon} |x|\,\nu(dx) < \infty$, and
that $\{L_t\}$ is a non-negative multiplicative functional of the form $L_t = g(t, X_t - X_0)$
with $\mathbb{E}_x L_t = 1$ for all $x, t$. Then the Markov process given by Theorem 1.7
is again a Lévy process. In particular, if $L_t = e^{\theta(X_t - X_0) - t\kappa(\theta)}$, then the changed
parameters in the representation (1.4) are

$$\tilde{\mu} = \mu + \theta\sigma^2, \qquad \tilde{\sigma}^2 = \sigma^2, \qquad \tilde{\nu}(dx) = e^{\theta x}\,\nu(dx).$$


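Proof. For the first statement, we use the characterization (1.3) and get

$$\begin{aligned} \tilde{\mathbb{E}}[f(X_{t+s} - X_t) \mid \mathscr{F}_t] &= \mathbb{E}[f(X_{t+s} - X_t)\,L_s \circ \theta_t \mid \mathscr{F}_t] \\ &= \mathbb{E}[f(X_{t+s} - X_t)\,g(s, X_{t+s} - X_t) \mid \mathscr{F}_t] \\ &= \mathbb{E}_0[f(X_s)\,g(s, X_s)] = \mathbb{E}_0[f(X_s) L_s] \\ &= \tilde{\mathbb{E}}_0 f(X_s). \end{aligned}$$

For the second, let $e^{\tilde{\kappa}(\alpha)} = \tilde{\mathbb{E}}_0 e^{\alpha X_1}$. Then

$$e^{\tilde{\kappa}(\alpha)} = \mathbb{E}_0[L_1 e^{\alpha X_1}] = e^{-\kappa(\theta)}\,\mathbb{E}_0[e^{(\alpha + \theta) X_1}] = e^{\kappa(\alpha + \theta) - \kappa(\theta)},$$

$$\begin{aligned} \tilde{\kappa}(\alpha) &= \kappa(\alpha + \theta) - \kappa(\theta) \\ &= \alpha\mu + ((\alpha + \theta)^2 - \theta^2)\sigma^2/2 + \int_{-\infty}^{\infty} (e^{(\alpha + \theta)x} - e^{\theta x})\,\nu(dx) \\ &= \alpha(\mu + \theta\sigma^2) + \alpha^2\sigma^2/2 + \int_{-\infty}^{\infty} (e^{\alpha x} - 1)\,e^{\theta x}\,\nu(dx). \end{aligned}$$


Remark 3.9 If $X_0 = 0$, then the martingale $\{e^{\theta X_t - t\kappa(\theta)}\}$ is the continuous
time analogue of the Wald martingale (3.4).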


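Example 3.10 Let $X_t$ be the claim surplus process of a compound Poisson
risk process with Poisson rate $\beta$ and claim size distribution $B$, corresponding to
$\mu = -1$, $\sigma = 0$, $\nu(dx) = \beta B(dx)$. Then we can write

$$\tilde{\nu}(dx) = \beta e^{\theta x} B(dx) = \tilde{\beta}\tilde{B}(dx), \quad \text{where } \tilde{\beta} = \beta\hat{B}[\theta], \quad \tilde{B}(dx) = \frac{e^{\theta x}}{\hat{B}[\theta]}\,B(dx).$$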




Thus (since $\tilde{\mu} = \mu = -1$, $\tilde{\sigma} = \sigma = 0$) the changed process is the claim surplus
process of another compound Poisson risk process with Poisson rate $\tilde{\beta}$ and claim
size distribution $\tilde{B}$.
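
Example 3.10 becomes fully explicit for exponential claims. The following sketch assumes $B$ is exponential with rate $\delta$ (an illustrative assumption not made in the text), so that $\hat{B}[\alpha] = \delta/(\delta - \alpha)$ for $\alpha < \delta$, the changed claim distribution is exponential with rate $\delta - \theta$, and the identity $\tilde{\kappa}(\alpha) = \kappa(\alpha + \theta) - \kappa(\theta)$ from the proof of Theorem 3.8 can be verified numerically. The function names are hypothetical.

```python
def kappa(alpha, beta, delta):
    """kappa(alpha) = beta * (Bhat[alpha] - 1) - alpha for Exp(delta) claims, alpha < delta."""
    return beta * (delta / (delta - alpha) - 1.0) - alpha

def tilted_parameters(theta, beta, delta):
    """Changed Poisson rate and changed exponential claim rate for theta < delta."""
    b_hat = delta / (delta - theta)      # Bhat[theta]
    return beta * b_hat, delta - theta   # beta~ = beta * Bhat[theta], B~ = Exp(delta - theta)
```

For $\beta = 1$, $\delta = 2$, $\theta = 0.5$ this gives $\tilde{\beta} = 4/3$ and claim rate $1.5$: under the changed measure claims arrive more often and are larger on average.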


Example 3.11 For an example of a likelihood ratio not covered by Theorem
3.8, let the given Markov process (specified by the $\mathbb{P}_x$) be the claim surplus
process of a compound Poisson risk process with Poisson rate $\beta$ and claim size
distribution $B$, and let the $\tilde{\mathbb{P}}_x$ refer to the claim surplus process of another
compound Poisson risk process with Poisson rate $\tilde{\beta} = \beta$ and claim size distribution
$\tilde{B} \ne B$. Recalling that $\sigma_1, \sigma_2, \dots$ are the arrival times and $U_1, U_2, \dots$ the
corresponding claim sizes, it is then easily seen that

$$L_t = \prod_{i:\, \sigma_i \le t} \frac{d\tilde{B}}{dB}(U_i)$$

whenever the Radon-Nikodym derivative $d\tilde{B}/dB$ exists (e.g. $d\tilde{B}/dB = \tilde{b}/b$ when
$\tilde{B}$, $B$ have densities $\tilde{b}$, $b$ with $b(x) > 0$ for all $x$ such that $\tilde{b}(x) > 0$).


4 Markov additive processes 


A Markov additive process, abbreviated as MAP in this section², is defined as
a bivariate Markov process $\{X_t\} = \{(J_t, S_t)\}$, where $\{J_t\}$ is a Markov process
with state space $E$ (say) and the increments of $\{S_t\}$ are governed by $\{J_t\}$ in the
sense that

$$\mathbb{E}[f(S_{t+s} - S_t)\,g(J_{t+s}) \mid \mathscr{F}_t] = \mathbb{E}_{J_t, 0}[f(S_s)\,g(J_s)]. \tag{4.1}$$

For shorthand, we write $\mathbb{P}_i$, $\mathbb{E}_i$ instead of $\mathbb{P}_{i,0}$, $\mathbb{E}_{i,0}$ in the following.

As for processes with stationary independent increments, the structure of
MAP's is completely understood when $E$ is finite:


In discrete time, a MAP is specified by the measure-valued matrix (kernel)
$F(dx)$ whose $ij$th element is the defective probability distribution $F_{ij}(dx) = \mathbb{P}_i(J_1 = j, Y_1 \in dx)$
where $Y_n = S_n - S_{n-1}$. An alternative description is
in terms of the transition matrix $P = (p_{ij})_{i,j \in E}$ (here $p_{ij} = \mathbb{P}_i(J_1 = j)$)
and the probability measures

$$H_{ij}(dx) = \mathbb{P}(Y_1 \in dx \mid J_0 = i, J_1 = j) = \frac{F_{ij}(dx)}{p_{ij}}.$$

In simulation language, this means that the MAP can be simulated by first
simulating the Markov chain $\{J_n\}$ and next the $Y_1, Y_2, \dots$ by generating
$Y_n$ according to $H_{ij}$ when $J_{n-1} = i$, $J_n = j$.

[Footnote 2: and only there; one reason is that in parts of the applied probability literature, MAP
stands for the Markovian arrival process discussed below.]


If all $F_{ij}$ are concentrated on $(0, \infty)$, a MAP is the same as a semi-Markov
or Markov renewal process, with the $Y_n$ being interpreted as interarrival
times.
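
The simulation recipe just described can be sketched as follows. This is only an illustration: the transition matrix `P`, the sampler `sample_increment` (which draws from $H_{ij}$) and the function name are hypothetical inputs and are not taken from the text.

```python
import random

def simulate_map(P, sample_increment, j0, n, rng):
    """Simulate (J_k, S_k), k = 0..n, for a discrete-time MAP with S_0 = 0.

    P is the transition matrix of {J_n} given as a list of rows, and
    sample_increment(i, j, rng) draws Y from H_ij (both are illustrative inputs).
    """
    states = list(range(len(P)))
    path, j, s = [(j0, 0.0)], j0, 0.0
    for _ in range(n):
        nxt = rng.choices(states, weights=P[j])[0]  # first the Markov chain step
        s += sample_increment(j, nxt, rng)          # then Y_n from H_ij
        j = nxt
        path.append((j, s))
    return path
```

A call such as `simulate_map([[0.5, 0.5], [0.2, 0.8]], lambda i, j, rng: rng.gauss(float(j), 1.0), 0, 100, random.Random(0))` would produce a path whose increments are normal with a mean depending on the state entered.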


In continuous time (assuming D-paths), $\{J_t\}$ is specified by its intensity matrix
$\Lambda = (\lambda_{ij})_{i,j \in E}$. On an interval $[t, t+s)$ where $J_t \equiv i$, $\{S_t\}$ evolves like
a Lévy process with the parameters $\mu_i, \sigma_i^2, \nu_i(dx)$ in II.(1.4) depending on
$i$. In addition, a jump of $\{J_t\}$ from $i$ to $j \ne i$ has probability $q_{ij}$ of giving
rise to a jump of $\{S_t\}$ at the same time, the size of which has some
distribution $B_{ij}$. [That a process with this description is a MAP is obvious;
the converse requires a proof, which we omit and refer to Neveu [663]
or Çinlar [247].]


If $E$ is infinite a MAP may be much more complicated. As an example, let
$\{J_t\}$ be standard Brownian motion on the line. Then a Markov additive
process can be defined by letting

$$S_t = \lim_{\epsilon \downarrow 0} \frac{1}{2\epsilon} \int_0^t I(|J_s| \le \epsilon)\,ds$$

be the local time at 0 up to time $t$.


As a generalization of the m.g.f., consider the matrix $\hat{F}_t[\alpha]$ with $ij$th element
$\mathbb{E}_i[e^{\alpha S_t}; J_t = j]$.


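Proposition 4.1 For a MAP in discrete time and with $E$ finite, $\hat{F}_n[\alpha] = \hat{F}[\alpha]^n$
where

$$\hat{F}[\alpha] = \hat{F}_1[\alpha] = \bigl(\mathbb{E}_i[e^{\alpha S_1}; J_1 = j]\bigr)_{i,j \in E} = \bigl(\hat{F}_{ij}[\alpha]\bigr)_{i,j \in E} = \bigl(p_{ij}\hat{H}_{ij}[\alpha]\bigr)_{i,j \in E}.$$


Proof. Conditioning upon $(J_n, S_n)$ yields

$$\mathbb{E}_i[e^{\alpha S_{n+1}}; J_{n+1} = j] = \sum_{k \in E} \mathbb{E}_i[e^{\alpha S_n}; J_n = k]\,\mathbb{E}_k[e^{\alpha S_1}; J_1 = j],$$

which in matrix formulation is the same as $\hat{F}_{n+1}[\alpha] = \hat{F}_n[\alpha]\hat{F}[\alpha]$.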
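
Proposition 4.1 can be checked by brute force on a small example. The sketch below assumes that each $H_{ij}$ is a point mass at a value $y_{ij}$ (an illustrative simplification, so that $\hat{H}_{ij}[\alpha] = e^{\alpha y_{ij}}$) and compares the matrix power with a direct sum over all paths of $\{J_n\}$. The names are hypothetical.

```python
import math
from itertools import product

def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def mat_pow(A, n):
    """n-th power of a square matrix by repeated multiplication."""
    R = [[float(i == j) for j in range(len(A))] for i in range(len(A))]
    for _ in range(n):
        R = mat_mul(R, A)
    return R

def f_hat(alpha, P, y):
    """F^[alpha] with entries p_ij * exp(alpha * y_ij) (H_ij a point mass at y_ij)."""
    m = len(P)
    return [[P[i][j] * math.exp(alpha * y[i][j]) for j in range(m)] for i in range(m)]

def f_hat_n_bruteforce(alpha, P, y, n):
    """E_i[exp(alpha * S_n); J_n = j] by summing over all state paths J_1, ..., J_n."""
    m = len(P)
    out = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for states in product(range(m), repeat=n):
            prob, s, prev = 1.0, 0.0, i
            for j in states:
                prob *= P[prev][j]
                s += y[prev][j]
                prev = j
            out[i][prev] += prob * math.exp(alpha * s)
    return out
```

The path enumeration grows like $|E|^n$ and is meant only as a check; the matrix power is the practical way to evaluate $\hat{F}_n[\alpha]$.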


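Proposition 4.2 Let $E$ be finite and consider a continuous time Markov additive
process with parameters $\Lambda$, $\mu_i$, $\sigma_i^2$, $\nu_i(dx)$ ($i \in E$), $q_{ij}$, $B_{ij}$ ($i, j \in E$) and
$S_0 = 0$. Then the matrix $\hat{F}_t[\alpha]$ with $ij$th element $\mathbb{E}_i[e^{\alpha S_t}; J_t = j]$ is given by
$e^{tK[\alpha]}$, where

$$K[\alpha] = \Lambda + \bigl(\kappa^{(i)}(\alpha)\bigr)_{\mathrm{diag}} + \bigl(\lambda_{ij} q_{ij} (\hat{B}_{ij}[\alpha] - 1)\bigr),$$

$$\kappa^{(i)}(\alpha) = \alpha\mu_i + \alpha^2\sigma_i^2/2 + \int_{-\infty}^{\infty} (e^{\alpha x} - 1)\,\nu_i(dx).$$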
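Proof. Let $\{S_t^{(i)}\}$ be a Lévy process with parameters $\mu_i$, $\sigma_i^2$, $\nu_i(dx)$. Then, up
to $o(h)$ terms,

$$\begin{aligned} \mathbb{E}_i[e^{\alpha S_{t+h}}; J_{t+h} = j] &= (1 + \lambda_{jj}h)\,\mathbb{E}_i[e^{\alpha S_t}; J_t = j]\,\mathbb{E}e^{\alpha S_h^{(j)}} \\ &\quad + \sum_{k \ne j} \lambda_{kj}h\,\mathbb{E}_i[e^{\alpha S_t}; J_t = k]\,\{1 - q_{kj} + q_{kj}\hat{B}_{kj}[\alpha]\} \\ &= \mathbb{E}_i[e^{\alpha S_t}; J_t = j]\,(1 + h\kappa^{(j)}(\alpha)) \\ &\quad + h \sum_{k \in E} \mathbb{E}_i[e^{\alpha S_t}; J_t = k]\,\{\lambda_{kj} + \lambda_{kj} q_{kj} (\hat{B}_{kj}[\alpha] - 1)\} \end{aligned}$$

(recall that $q_{jj} = 0$). In matrix formulation, this means that

$$\hat{F}_{t+h}[\alpha] = \hat{F}_t[\alpha]\Bigl(I + h\bigl(\kappa^{(i)}(\alpha)\bigr)_{\mathrm{diag}} + h\Lambda + h\bigl(\lambda_{ij} q_{ij} (\hat{B}_{ij}[\alpha] - 1)\bigr)\Bigr),$$

$$\hat{F}_t'[\alpha] = \hat{F}_t[\alpha]\,K[\alpha],$$

which in conjunction with $\hat{F}_0[\alpha] = I$ implies $\hat{F}_t[\alpha] = e^{tK[\alpha]}$ according to the
standard solution formula for systems of linear differential equations.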


In the following, assume that the Markov chain/process $\{J_t\}$ is ergodic. By
Perron-Frobenius theory (see A.4c), we infer that in the discrete time case the
matrix $\hat{F}[\alpha]$ has a real eigenvalue $\kappa(\alpha)$ with maximal absolute value and that
in the continuous time case $K[\alpha]$ has a real eigenvalue $\kappa(\alpha)$ with maximal real
part. The corresponding left and right eigenvectors $\nu^{(\alpha)}$, $h^{(\alpha)}$ may be chosen
with strictly positive components. Since $\nu^{(\alpha)}$, $h^{(\alpha)}$ are only given up to
constants, we are free to impose two normalizations, and we shall take

$$\nu^{(\alpha)} h^{(\alpha)} = 1, \qquad \pi h^{(\alpha)} = 1,$$

where $\pi = \nu^{(0)}$ is the stationary distribution. Then $h^{(0)} = e$.
The function $\kappa(\alpha)$ plays in many respects the same role as the cumulant
g.f. of a random walk, as will be seen from the following results. In particular,
its derivatives are 'asymptotic cumulants', cf. Corollary 4.7, and appropriate
generalizations of the Wald martingale (and the associated change of measure)
can be defined in terms of $\kappa(\alpha)$ (and $h^{(\alpha)}$), cf. Proposition 4.4.
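
In the discrete time case, the dominant eigenvalue and the right eigenvector of $\hat{F}[\alpha]$ can be approximated by power iteration. The following is a minimal sketch, assuming all entries of $\hat{F}[\alpha]$ are strictly positive and each $H_{ij}$ is a point mass at $y_{ij}$; the eigenvector is normalized to sum to one, which differs from the normalization $\pi h^{(\alpha)} = 1$ used in the text only by a constant factor. The function names are hypothetical.

```python
import math

def f_hat(alpha, P, y):
    """F^[alpha] with entries p_ij * exp(alpha * y_ij) (H_ij a point mass at y_ij)."""
    m = len(P)
    return [[P[i][j] * math.exp(alpha * y[i][j]) for j in range(m)] for i in range(m)]

def perron(F, iters=2000):
    """Power iteration for a matrix with strictly positive entries.

    Returns (dominant eigenvalue, right eigenvector scaled to sum to 1).
    """
    m = len(F)
    h = [1.0 / m] * m
    lam = 0.0
    for _ in range(iters):
        g = [sum(F[i][j] * h[j] for j in range(m)) for i in range(m)]
        lam = sum(g)              # h sums to 1, so sum(F h) estimates the eigenvalue
        h = [x / lam for x in g]
    return lam, h
```

At $\alpha = 0$ the matrix is the transition matrix $P$, so the iteration returns eigenvalue 1 and a constant eigenvector, in line with $h^{(0)} = e$.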


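Corollary 4.3 $\mathbb{E}_i[e^{\alpha S_t}; J_t = j] \sim h_i^{(\alpha)} \nu_j^{(\alpha)} e^{t\kappa(\alpha)}$.


Proof. By Perron-Frobenius theory (see A.4c).


We also get an analogue of the Wald martingale for random walks:


Proposition 4.4 $\mathbb{E}_i\bigl[e^{\alpha S_t} h_{J_t}^{(\alpha)}\bigr] = h_i^{(\alpha)} e^{t\kappa(\alpha)}$. Furthermore,

$$\bigl\{e^{\alpha S_t - t\kappa(\alpha)}\,h_{J_t}^{(\alpha)}\bigr\}_{t \ge 0}$$

is a martingale.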


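Proof. For the first assertion, just note that

$$\mathbb{E}_i\bigl[e^{\alpha S_t} h_{J_t}^{(\alpha)}\bigr] = e_i^{\mathsf{T}} \hat{F}_t[\alpha] h^{(\alpha)} = e_i^{\mathsf{T}} e^{tK[\alpha]} h^{(\alpha)} = e_i^{\mathsf{T}} e^{t\kappa(\alpha)} h^{(\alpha)} = e^{t\kappa(\alpha)} h_i^{(\alpha)}.$$

It then follows that

$$\begin{aligned} \mathbb{E}\bigl[e^{\alpha S_{t+v} - (t+v)\kappa(\alpha)} h_{J_{t+v}}^{(\alpha)} \bigm| \mathscr{F}_t\bigr] &= e^{\alpha S_t - t\kappa(\alpha)}\,\mathbb{E}\bigl[e^{\alpha(S_{t+v} - S_t) - v\kappa(\alpha)} h_{J_{t+v}}^{(\alpha)} \bigm| \mathscr{F}_t\bigr] \\ &= e^{\alpha S_t - t\kappa(\alpha)}\,\mathbb{E}_{J_t}\bigl[e^{\alpha S_v - v\kappa(\alpha)} h_{J_v}^{(\alpha)}\bigr] = e^{\alpha S_t - t\kappa(\alpha)} h_{J_t}^{(\alpha)}. \end{aligned}$$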


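Let $k^{(\alpha)}$ denote the derivative of $h^{(\alpha)}$ w.r.t. $\alpha$, and write $k = k^{(0)}$.


Corollary 4.5 $\mathbb{E}_i S_t = t\kappa'(0) + k_i - \mathbb{E}_i k_{J_t} = t\kappa'(0) + k_i - e_i^{\mathsf{T}} e^{\Lambda t} k$.


Proof. By differentiation in Proposition 4.4,

$$\mathbb{E}_i\bigl[S_t e^{\alpha S_t} h_{J_t}^{(\alpha)} + e^{\alpha S_t} k_{J_t}^{(\alpha)}\bigr] = e^{t\kappa(\alpha)}\bigl(k_i^{(\alpha)} + t\kappa'(\alpha) h_i^{(\alpha)}\bigr). \tag{4.2}$$

Let $\alpha = 0$ and recall that $h^{(0)} = e$ so that $h_i^{(0)} = h_{J_t}^{(0)} = 1$.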


The argument is slightly heuristic (e.g., the existence of exponential moments
is assumed) but can be made rigorous by passing to characteristic functions. In
the same way, one obtains a generalization of Wald's identity $\mathbb{E}S_\tau = \mathbb{E}\tau \cdot \mathbb{E}S_1$
for a random walk:

Corollary 4.6 For any stopping time $\tau$ with finite mean,

$$\mathbb{E}_i S_\tau = \kappa'(0)\,\mathbb{E}_i \tau + k_i - \mathbb{E}_i k_{J_\tau}.$$


Corollary 4.7 No matter the initial distribution $\nu$ of $J_0$,

$$\lim_{t \to \infty} \frac{\mathbb{E}_\nu S_t}{t} = \kappa'(0), \qquad \lim_{t \to \infty} \frac{\mathrm{Var}_\nu S_t}{t} = \kappa''(0).$$


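Proof. The first assertion is immediate by dividing by $t$ in Corollary 4.5. For
the second, we differentiate (4.2) to get

$$\begin{aligned} &\mathbb{E}_i\bigl[S_t^2 e^{\alpha S_t} h_{J_t}^{(\alpha)} + 2 S_t e^{\alpha S_t} k_{J_t}^{(\alpha)} + e^{\alpha S_t} k_{J_t}^{(\alpha)\prime}\bigr] \\ &\quad = e^{t\kappa(\alpha)}\Bigl(k_i^{(\alpha)\prime} + t\kappa'(\alpha) k_i^{(\alpha)} + t\bigl\{\kappa''(\alpha) h_i^{(\alpha)} + t\kappa'(\alpha)^2 h_i^{(\alpha)} + \kappa'(\alpha) k_i^{(\alpha)}\bigr\}\Bigr). \end{aligned}$$

Multiplying by $\nu_i$, summing and letting $\alpha = 0$ yields

$$\mathbb{E}_\nu\bigl[S_t^2 + 2 S_t k_{J_t}\bigr] = t^2\kappa'(0)^2 + 2t\kappa'(0)\nu k + t\kappa''(0) + O(1).$$

Squaring in Corollary 4.5 yields

$$[\mathbb{E}_\nu S_t]^2 = t^2\kappa'(0)^2 + 2t\kappa'(0)\nu k - 2t\kappa'(0)\,\mathbb{E}_\nu k_{J_t} + O(1).$$

Since it is easily seen by an asymptotic independence argument that $\mathbb{E}_\nu[S_t k_{J_t}] = t\kappa'(0)\,\mathbb{E}_\nu k_{J_t} + O(1)$,
subtraction yields $\mathrm{Var}_\nu S_t = t\kappa''(0) + O(1)$.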


Remark 4.8 Also for $E$ being infinite (possibly uncountable), $\mathbb{E}e^{\alpha S_t}$ typically
grows asymptotically exponentially with a rate $\kappa(\alpha)$ independent of the initial
condition (i.e., the distribution of $J_0$). More precisely, there is typically a function
$h = h^{(\alpha)}$ on $E$ and a $\kappa(\alpha)$ such that

$$\mathbb{E}_x e^{\alpha S_t - t\kappa(\alpha)} \to h(x), \qquad t \to \infty,$$

for all $x \in E$. From (4.1) one then (at least heuristically) obtains

$$\begin{aligned} h(x) &= \lim_{v \to \infty} \mathbb{E}_x e^{\alpha S_v - v\kappa(\alpha)} \\ &= \lim_{v \to \infty} \mathbb{E}_x\bigl[e^{\alpha S_t - t\kappa(\alpha)}\,\mathbb{E}_{J_t} e^{\alpha S_{v-t} - (v-t)\kappa(\alpha)}\bigr] \\ &= \mathbb{E}_x\bigl[e^{\alpha S_t - t\kappa(\alpha)} h(J_t)\bigr]. \end{aligned}$$

It then follows as in the proof of Proposition 4.4 that

$$\Bigl\{\frac{h(J_t)}{h(J_0)}\,e^{\alpha S_t - t\kappa(\alpha)}\Bigr\}_{t \ge 0} \tag{4.3}$$

is a martingale. In view of this discussion, we take the martingale property
as our basic condition below (though this is automatic in the finite case). An
example beyond the finite case occurs for periodic risk processes in VII.6, where
$\{J_t\}$ is deterministic periodic motion on $E = [0, 1)$ (i.e., $J_t = (s + t) \bmod 1$ $\mathbb{P}_s$-a.s.
for $s \in E$).


Remark 4.9 The condition that (4.3) is a martingale can be expressed via the
generator $\mathscr{A}$ (cf. II.4a) of $\{X_t\} = \{(J_t, S_t)\}$ as follows. Given a function $h$ on
$E$, let $h_\alpha(i, s) = e^{\alpha s} h(i)$. We then want to determine $h$ and $\kappa(\alpha)$ such that
$\mathbb{E}_i e^{\alpha S_t} h(J_t) = e^{t\kappa(\alpha)} h(i)$. For $t$ small, this leads to

$$h(i) + t\mathscr{A}h_\alpha(i, 0) = h(i)(1 + t\kappa(\alpha)),$$

$$\mathscr{A}h_\alpha(i, 0) = \kappa(\alpha) h(i).$$

We shall not exploit this approach systematically; see, however, VI.3b and Remark
VII.6.5.


Proposition 4.10 Let $\{(J_t, S_t)\}$ be a MAP and let $\theta$ be such that

$$\{L_t\}_{t \ge 0} = \Bigl\{\frac{h(J_t)}{h(J_0)}\,e^{\theta S_t - t\kappa(\theta)}\Bigr\}_{t \ge 0}$$

is a $\mathbb{P}_x$-martingale for each $x \in E$. Then $\{L_t\}$ is a multiplicative functional,
and the family $\{\tilde{\mathbb{P}}_x\}_{x \in E}$ given by Theorem 1.7 defines a new MAP.


Proof. That $\{L_t\}$ is a multiplicative functional follows from

$$L_s \circ \theta_t = \frac{h(J_{t+s})}{h(J_t)}\,e^{\theta(S_{t+s} - S_t) - s\kappa(\theta)}.$$

The proof that we have a MAP is contained in the proof of Theorem 4.11 below
in the finite case. In the infinite case, one can directly verify that (4.1) holds
for the $\tilde{\mathbb{P}}_x$. We omit the details.


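Theorem 4.11 Consider the irreducible case with $E$ finite. Then the MAP in
Proposition 4.10 is given by

$$\tilde{P} = e^{-\kappa(\theta)} \Delta_{h^{(\theta)}}^{-1} \hat{F}[\theta] \Delta_{h^{(\theta)}}, \qquad \tilde{H}_{ij}(dx) = \frac{e^{\theta x}}{\hat{H}_{ij}[\theta]}\,H_{ij}(dx)$$

in the discrete time case, and by

$$\tilde{\Lambda} = \Delta_{h^{(\theta)}}^{-1} K[\theta] \Delta_{h^{(\theta)}} - \kappa(\theta) I, \qquad \tilde{\mu}_i = \mu_i + \theta\sigma_i^2, \qquad \tilde{\sigma}_i^2 = \sigma_i^2,$$

$$\tilde{\nu}_i(dx) = e^{\theta x}\,\nu_i(dx), \qquad \tilde{q}_{ij} = \frac{q_{ij}\hat{B}_{ij}[\theta]}{1 + q_{ij}(\hat{B}_{ij}[\theta] - 1)}, \qquad \tilde{B}_{ij}(dx) = \frac{e^{\theta x}}{\hat{B}_{ij}[\theta]}\,B_{ij}(dx)$$

in the continuous time case. Here $\Delta_{h^{(\theta)}}$ is the diagonal matrix with the $h_i^{(\theta)}$ on
the diagonal. In particular, if $\nu_i(dx)$ is compound Poisson, $\nu_i(dx) = \beta_i B_i(dx)$
with $\beta_i < \infty$ and $B_i$ a probability measure, then also $\tilde{\nu}_i(dx)$ is compound Poisson
with

$$\tilde{\beta}_i = \beta_i \hat{B}_i[\theta], \qquad \tilde{B}_i(dx) = \frac{e^{\theta x}}{\hat{B}_i[\theta]}\,B_i(dx).$$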


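Remark 4.12 The expression for $\tilde{\Lambda}$ means

$$\tilde{\lambda}_{ij} = \frac{h_j^{(\theta)}}{h_i^{(\theta)}}\,\lambda_{ij}\bigl[1 + q_{ij}(\hat{B}_{ij}[\theta] - 1)\bigr], \qquad i \ne j. \tag{4.4}$$

In particular, this gives a direct verification that $\tilde{\Lambda}$ is an intensity matrix: the off-diagonal
elements are non-negative because $\lambda_{ij} \ge 0$, $0 \le q_{ij} \le 1$ and $\hat{B}_{ij}[\theta] > 0$.
That the rows sum to 0 follows from

$$\tilde{\Lambda}e = \Delta_{h^{(\theta)}}^{-1} K[\theta] h^{(\theta)} - \kappa(\theta)e = \kappa(\theta)\Delta_{h^{(\theta)}}^{-1} h^{(\theta)} - \kappa(\theta)e = \kappa(\theta)e - \kappa(\theta)e = 0.$$

That $0 \le \tilde{q}_{ij} \le 1$ follows from the inequality

$$\frac{qb}{1 + q(b - 1)} \le 1, \qquad 0 \le q \le 1, \quad 0 < b < \infty.$$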


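Proof of Theorem 4.11. First note that the $ij$th element of $\hat{\tilde{F}}_t[\alpha]$ is

$$\tilde{\mathbb{E}}_i[e^{\alpha S_t}; J_t = j] = \mathbb{E}_i[L_t e^{\alpha S_t}; J_t = j] = e^{-t\kappa(\theta)}\,\frac{h_j^{(\theta)}}{h_i^{(\theta)}}\,\mathbb{E}_i[e^{(\alpha + \theta)S_t}; J_t = j].$$

In matrix notation, this means that

$$\hat{\tilde{F}}_t[\alpha] = e^{-t\kappa(\theta)} \Delta_{h^{(\theta)}}^{-1} \hat{F}_t[\alpha + \theta] \Delta_{h^{(\theta)}}. \tag{4.5}$$

Consider first the discrete time case. Here the stated formula for $\tilde{P}$ follows
immediately by letting $t = 1$, $\alpha = 0$ in (4.5). Further

$$\begin{aligned} \tilde{F}_{ij}(dx) &= \tilde{\mathbb{P}}_i(Y_1 \in dx, J_1 = j) = \mathbb{E}_i[L_1; Y_1 \in dx, J_1 = j] \\ &= \frac{h_j^{(\theta)}}{h_i^{(\theta)}}\,e^{\theta x - \kappa(\theta)}\,\mathbb{P}_i(Y_1 \in dx, J_1 = j) = \frac{h_j^{(\theta)}}{h_i^{(\theta)}}\,e^{\theta x - \kappa(\theta)}\,F_{ij}(dx). \end{aligned}$$

This shows that $\tilde{F}_{ij}$ is absolutely continuous w.r.t. $F_{ij}$ with a density proportional
to $e^{\theta x}$. Hence the same is true for $\tilde{H}_{ij}$ and $H_{ij}$; since $\tilde{H}_{ij}$, $H_{ij}$ are
probability measures, it follows that indeed the normalizing constant is $\hat{H}_{ij}[\theta]$.
Similarly, in continuous time (4.5) yields

$$e^{t\tilde{K}[\alpha]} = \Delta_{h^{(\theta)}}^{-1}\,e^{t(K[\alpha + \theta] - \kappa(\theta)I)}\,\Delta_{h^{(\theta)}}.$$

By a general formula (A.13) for matrix-exponentials, this implies

$$\tilde{K}[\alpha] = \Delta_{h^{(\theta)}}^{-1}\bigl(K[\alpha + \theta] - \kappa(\theta)I\bigr)\Delta_{h^{(\theta)}} = \Delta_{h^{(\theta)}}^{-1} K[\alpha + \theta] \Delta_{h^{(\theta)}} - \kappa(\theta)I.$$

Letting $\alpha = 0$ yields the stated expression for $\tilde{\Lambda}$.
Now we can write

$$\begin{aligned} \tilde{K}[\alpha] &= \tilde{\Lambda} + \Delta_{h^{(\theta)}}^{-1}\bigl(K[\alpha + \theta] - K[\theta]\bigr)\Delta_{h^{(\theta)}} \\ &= \tilde{\Lambda} + \bigl(\kappa^{(i)}(\alpha + \theta) - \kappa^{(i)}(\theta)\bigr)_{\mathrm{diag}} + \Bigl(\frac{h_j^{(\theta)}}{h_i^{(\theta)}}\,\lambda_{ij} q_{ij}\bigl(\hat{B}_{ij}[\alpha + \theta] - \hat{B}_{ij}[\theta]\bigr)\Bigr). \end{aligned}$$

That $\kappa^{(i)}(\alpha + \theta) - \kappa^{(i)}(\theta)$ corresponds to the stated parameters $\tilde{\mu}_i$, $\tilde{\sigma}_i^2$, $\tilde{\nu}_i(dx)$ of
a Lévy process follows from Theorem 3.8. Finally note that by (4.4),

$$\begin{aligned} \frac{h_j^{(\theta)}}{h_i^{(\theta)}}\,\lambda_{ij} q_{ij}\bigl(\hat{B}_{ij}[\alpha + \theta] - \hat{B}_{ij}[\theta]\bigr) &= \frac{h_j^{(\theta)}}{h_i^{(\theta)}}\,\lambda_{ij} q_{ij} \hat{B}_{ij}[\theta]\bigl(\hat{\tilde{B}}_{ij}[\alpha] - 1\bigr) \\ &= \tilde{\lambda}_{ij}\tilde{q}_{ij}\bigl(\hat{\tilde{B}}_{ij}[\alpha] - 1\bigr). \end{aligned}$$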
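
For the discrete time case, the formula for $\tilde{P}$ can be illustrated numerically. The sketch below is hedged in two ways: it assumes each $H_{ij}$ is a point mass at $y_{ij}$ and $\hat{F}[\theta]$ has strictly positive entries, and it divides by the numerically computed Perron eigenvalue `lam` of $\hat{F}[\theta]$, which is the factor that the normalization $e^{-\kappa(\theta)}$ in the theorem has to cancel for the rows of $\tilde{P}$ to sum to one. The function names are hypothetical.

```python
import math

def perron(F, iters=2000):
    """Power iteration: (dominant eigenvalue, right eigenvector scaled to sum to 1)."""
    m = len(F)
    h = [1.0 / m] * m
    lam = 0.0
    for _ in range(iters):
        g = [sum(F[i][j] * h[j] for j in range(m)) for i in range(m)]
        lam = sum(g)
        h = [x / lam for x in g]
    return lam, h

def tilted_transition(theta, P, y):
    """P~ = lam^{-1} * Delta_h^{-1} * F^[theta] * Delta_h, entry by entry."""
    m = len(P)
    F = [[P[i][j] * math.exp(theta * y[i][j]) for j in range(m)] for i in range(m)]
    lam, h = perron(F)
    return [[F[i][j] * h[j] / (lam * h[i]) for j in range(m)] for i in range(m)]
```

The result is again a transition matrix, and for $\theta = 0$ it reduces to $P$ itself.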


Notes and references The earliest paper on treatment of MAP's in the present
spirit we know of is Nagaev [654]. Much of the pioneering was done in the sixties in
papers like Keilson & Wishart [524, 525, 526] and Miller [642, 643, 644] in discrete
time; the literature on the continuous time case tends more to deal with special cases.
Though the literature on MAP's is extensive, there is, however, hardly a single
comprehensive treatment; an extensive bibliography on aspects of the theory can be found
in Asmussen [58].
Conditions for analogues of Corollary 4.3 for an infinite $E$ are given by Ney &
Nummelin [657]. For the Wald identity in Corollary 4.6, see also Fuh & Lai [380] and
Moustakides [651]. The closest reference on exponential families of random walks on
a Markov chain we know of within the more statistically oriented literature is Höglund
[477], which, however, is slightly less general than the present setting.


5 The ladder height distribution 


We consider the claim surplus process $\{S_t\}$ of a risk process with jumps $U_i$,
interclaim times $T_i > 0$ and premium rate 1 (but note that no independence
or Poisson assumptions are made). As usual, $\tau(u) = \inf\{t > 0 : S_t > u\}$ is the
time to ruin. In the particular case $u = 0$, write $\tau_+ = \tau(0)$ and define the
associated ladder height $S_{\tau_+}$ and ladder height distribution by

$$G_+(x) = \mathbb{P}(S_{\tau_+} \le x) = \mathbb{P}(S_{\tau_+} \le x, \tau_+ < \infty).$$

Note that $G_+$ is concentrated on $(0, \infty)$, i.e. has no mass on $(-\infty, 0]$, and is
typically defective, i.e. has total mass $\|G_+\| = \mathbb{P}(\tau_+ < \infty) < 1$.

FIGURE III.2 


The term ladder height is motivated from the shape of the process $\{M_t\}$ of
relative maxima, see Fig. III.2. The first ladder step is precisely $S_{\tau_+}$, and the
maximum $M$ is the total height of the ladder, i.e. the sum of all the ladder steps
(if $\eta > 0$, there are only finitely many). In Fig. III.2, the second ladder point is
$S_{\tau_+(2)}$ where $\tau_+(2)$ is the time of the next relative maximum after $\tau_+(1) = \tau_+$,
the second ladder height (step) is $S_{\tau_+(2)} - S_{\tau_+(1)}$ and so on. In simple cases like
the compound Poisson model, the ladder heights are i.i.d., a fact which turns out
to be extremely useful. In other cases like the Markovian environment model,
they have a semi-Markov structure (but in complete generality, the dependence
structure seems too complicated to be useful). In any case, at present we concentrate
on the first ladder height. The main result of this section is Theorem
5.5 below, which gives an explicit expression for $G_+$ in a very general setting,
where basically only stationarity is assumed.

To illustrate the ideas, we shall first consider the compound Poisson model
in the notation of Example II.3.2. Recall that $\bar{B}(x) = 1 - B(x)$ denotes the tail
of $B$.


Theorem 5.1 For the compound Poisson model with $\rho = \beta\mu_B < 1$, $G_+$ is
given by the defective density $g_+(x) = \beta\bar{B}(x) = \rho b_0(x)$ on $(0, \infty)$. Here $b_0(x) = \bar{B}(x)/\mu_B$.


For the proof of Theorem 5.1, define the pre-$\tau_+$-occupation measure $R_+$ by

$$R_+(A) = \mathbb{E}\int_0^\infty I(S_t \in A, \tau_+ > t)\,dt = \mathbb{E}\int_0^{\tau_+} I(S_t \in A)\,dt.$$

The interpretation of $R_+(A)$ is as the expected time $\{S_t\}$ spends in the set $A$
before $\tau_+$. Thus, $R_+$ is concentrated on $(-\infty, 0]$, i.e., has no mass on $(0, \infty)$.
Also, by approximation with step functions, it follows that for $g \ge 0$ measurable,

$$\int_{-\infty}^0 g(y)\,R_+(dy) = \mathbb{E}\int_0^{\tau_+} g(S_t)\,dt. \tag{5.1}$$


Lemma 5.2 $R_+$ is the restriction of the Lebesgue measure to $(-\infty, 0]$.


Proof. Let $T$ be fixed and define $S_t^* = S_T - S_{T-t}$, $0 \le t \le T$. That is,
$\{S_t^*\}_{0 \le t \le T}$ is constructed from $\{S_t\}_{0 \le t \le T}$ by time-reversion and hence, since
the distribution of the Poisson process is invariant under time reversion, has the
same distribution as $\{S_t\}_{0 \le t \le T}$, see Fig. III.3. Thus,

$$\begin{aligned} \mathbb{P}(S_T \in A, \tau_+ > T) &= \mathbb{P}(S_T \in A, S_t \le 0, 0 \le t \le T) \\ &= \mathbb{P}(S_T^* \in A, S_t^* \le 0, 0 \le t \le T) \\ &= \mathbb{P}(S_T \in A, S_T \le S_{T-t}, 0 \le t \le T) \\ &= \mathbb{P}(S_T \in A, S_T \le S_t, 0 \le t \le T). \end{aligned} \tag{5.2}$$

FIGURE III.3(A): $\tau_+ > t$


FIGURE III.3(B): $\tau_+ \le t$


Integrating w.r.t. $dT$, it follows that $R_+(A)$ is the expected time when $S_T$
is in $A$ and at a minimum at the same time. But since $S_t \to -\infty$ a.s., this is
just the Lebesgue measure of $A$, cf. Fig. III.4 where the bold lines correspond
to minimal values.


Lemma 5.3 G} is the restriction of 3R4*B to (0,00). That is, for A C (0,00), 


y= af B B(A- y)Ry (dy). 
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FIGURE III.4 


Proof. A jump of $\{S_t\}$ at time $t$ and of size $U$ contributes to the event
$\{S_{\tau_+} \in A\}$ precisely when $\tau_+ \ge t$, $U + S_{t-} \in A$. The probability of this given
$\{S_u\}_{u<t}$ is $B(A - S_{t-})\,I(\tau_+ \ge t)$, and since the jump rate is $\beta$, we get

$$G_+(A) = \int_0^\infty \beta\,dt\; E\bigl[B(A - S_{t-});\ \tau_+ \ge t\bigr]
= \beta\,E\int_0^{\tau_+} B(A - S_t)\,dt = \beta\int_{-\infty}^0 g(y)\,R_+(dy)$$

where $g(y) = B(A - y)$ (here we used the fact that the probability of a jump at
$t$ is zero in the second step, and (5.1) in the last).

Proof of Theorem 5.1. With $r_+(y) = I(y < 0)$ denoting the density of $R_+$,
Lemma 5.3 yields

$$g_+(x) = \beta\int_0^\infty r_+(x - z)\,B(dz) = \beta\int_0^\infty I(x < z)\,B(dz) = \beta\bar B(x).$$


Generalizing, we consider the claim surplus process $\{S_t^*\}_{t\ge 0}$ of a
risk reserve process in a very general set-up, assuming basically stationarity in
time and space,

$$\{S^*_{t+s} - S^*_s\}_{t\ge 0} \overset{\mathcal D}{=} \{S^*_t\}_{t\ge 0} \tag{5.3}$$

for all $s \ge 0$. The sample path structure is assumed to be as for the compound
Poisson case: $\{S_t^*\}$ is generated from interclaim times $T_k^*$ and claim sizes $U_k^*$
according to premium 1 per unit time, i.e.

$$S_t^* = \sum_{k=1}^{N_t^*} U_k^* - t \quad\text{where}\quad N_t^* = \max\{k = 0,1,\ldots :\ T_1^* + \cdots + T_k^* \le t\}.$$

The first ladder epoch $\tau_+^*$ is defined as $\inf\{t > 0 : S_t^* > 0\}$ and the corresponding
ladder height distribution is

$$G_+^*(A) = P(S^*_{\tau_+^*} \in A) = P(S^*_{\tau_+^*} \in A,\ \tau_+^* < \infty).$$


The traditional representation of the input sequence $\{(T_k^*,U_k^*)\}_{k=1,2,\ldots}$ is as a
marked point process $\mathcal M^*$, i.e. as a point process on $[0,\infty)\times(0,\infty)$. The points
in the plane (marked by $\times$ on Fig. III.5) are $(\sigma_k^*, U_k^*)$ $(k = 1,2,\ldots)$ where
$\sigma_k^* = T_1^* + \cdots + T_k^*$, the first component representing time (the arrival time) and
the second the mark (the claim size $U_k^*$). The marked point process $\mathcal M^*\circ\theta_s$
shifted by $s$ is defined the obvious way, cf. Fig. III.5 (the points in the plane are
$(\sigma_k^* - s, U_k^*)$ for those $k$ for which $\sigma_k^* - s \ge 0$). We call $\mathcal M^*$ stationary if
$\mathcal M^*\circ\theta_s$ has the same distribution as $\mathcal M^*$ for all $s \ge 0$; obviously, this is
equivalent to the risk process $\{S_t^*\}$ being stationary in the sense of (5.3). In the
stationary case, we define the arrival rate as $\beta = E\#\{k : \sigma_k^* \in [0,h]\}/h$ (by
stationarity, this does not depend on $h$).


FIGURE III.5: the marked point process $\mathcal M^*$ and its shifted version $\mathcal M^*\circ\theta_s$


Given a stationary marked point process $\mathcal M^*$, we define its Palm version $\mathcal M$
as a marked point process having the conditional distribution of $\mathcal M^*$ given an
arrival at time 0, i.e. $\sigma_1^* = 0$. We represent $\mathcal M$ by the sequence $(T_k, U_k)_{k=1,2,\ldots}$
where $T_1 = 0$, and let $T = T_2$ denote the first proper interarrival time. The two
fundamental formulas connecting $\mathcal M^*$ and $\mathcal M$ are

$$E\varphi(\mathcal M) = \frac{1}{\beta h}\,E\sum_{k:\ \sigma_k^*\in[0,h]} \varphi(\mathcal M^*\circ\theta_{\sigma_k^*}), \tag{5.4}$$

$$E\varphi(\mathcal M^*) = \frac{1}{ET}\,E\int_0^T \varphi(\mathcal M\circ\theta_t)\,dt,$$

where $T$ is the first arrival time $> 0$ of $\mathcal M$ and $h > 0$ an arbitrary constant (in
the literature, most often one takes $h = 1$). As above, the r.h.s. of (5.4) does not
depend on $h$; letting $h \downarrow 0$, $\beta h$ becomes the approximate probability $P(\sigma_1^* \le h)$
of an arrival in $[0,h]$ and the sum approximately $\varphi(\mathcal M^*)I(\sigma_1^* \le h)$. This more
or less gives a proof that indeed (5.4) represents the conditional distribution of
$\mathcal M^*$ given $\sigma_1^* = 0$. Note also that (again by stationarity) the Palm distribution
also represents the conditional distribution of $\mathcal M^*\circ\theta_t$ given an arrival at time
$t$. See, e.g., Sigman [812] or [APQ, VII.6] for these and further aspects of Palm
theory.


Example 5.4 Consider a finite Markov additive process (cf. Section 4) which
has pure jump structure corresponding to $\mu_i = \sigma_i^2 = 0$, $\nu_i(dx) = \beta_i B_i(dx)$.
Assume $\{J_t\}$ irreducible so that a stationary distribution $\pi = (\pi_i)_{i\in E}$ exists.

Interpreting jump times as arrival times and jump sizes as marks, we get a
marked point process generated by Poisson arrivals at rate $\beta_i$ and mark
distribution $B_i$ when $J_t = i$, and by some additional arrivals which occur w.p. $q_{ij}$
when $\{J_t\}$ jumps from $i$ to $j$ and have mark distribution $B_{ij}$. A stationary
marked point process $\mathcal M^*$ is obtained by assigning $J_0$ distribution $\pi$. If $J_{t-} = i$,
an arrival for $\mathcal M^*$ occurs before time $t + dt$ w.p.

$$dt\Bigl\{\beta_i + \sum_{j\ne i}\lambda_{ij}q_{ij}\Bigr\}.$$

Thus the arrival rate for $\mathcal M^*$ is

$$\beta = \sum_{i\in E}\pi_i\Bigl\{\beta_i + \sum_{j\ne i}\lambda_{ij}q_{ij}\Bigr\}.$$

Given that an arrival occurs at time $t$, the probability $a_{ij}$ of $J_{t-} = i$, $J_t = j$ is
$\pi_i\beta_i/\beta$ for $i = j$ and $\pi_i\lambda_{ij}q_{ij}/\beta$ for $i \ne j$. It follows that we can describe the
Palm version $\mathcal M$ as follows. First choose $(J_{0-}, J_0)$ w.p. $a_{ij}$ for $(i,j)$ and let the
initial mark $U_1$ have distribution $B_i$ when $i = j$ and $B_{ij}$ otherwise. After that,
let the arrivals and their marks be generated by $\{J_t\}$ starting from $J_0 = j$.
Note in particular that the Palm distribution of the mark size (i.e., the
distribution of $U_1$) is the mixture

$$B = \sum_{i\in E}\Bigl\{a_{ii}B_i + \sum_{j\ne i}a_{ij}B_{ij}\Bigr\}
= \frac{1}{\beta}\sum_{i\in E}\pi_i\Bigl\{\beta_i B_i + \sum_{j\ne i}\lambda_{ij}q_{ij}B_{ij}\Bigr\}.$$
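The quantities of Example 5.4 are easy to evaluate numerically. The following is a minimal sketch for a hypothetical two-state environment; all numerical values and the helper names `stationary_two_state` and `palm_quantities` are assumptions made for illustration, and only the formulas for $\beta$, $a_{ij}$ and the Palm mixture are taken from the example.

```python
# Hypothetical two-state illustration of Example 5.4 (all numbers are made up).
lam = [[-1.0, 1.0], [2.0, -2.0]]       # intensity matrix (lambda_ij) of {J_t}
beta_i = [1.0, 3.0]                    # Poisson arrival rate beta_i in state i
q = [[0.0, 0.5], [0.25, 0.0]]          # q_ij: prob. of an extra arrival at a jump i -> j
mean_B_i = [0.2, 0.1]                  # mean of B_i
mean_B_ij = [[0.0, 0.4], [0.3, 0.0]]   # mean of B_ij (diagonal unused)


def stationary_two_state(lam):
    """Stationary distribution pi of a two-state Markov jump process."""
    a, b = lam[0][1], lam[1][0]
    return [b / (a + b), a / (a + b)]


def palm_quantities(pi, lam, beta_i, q, mean_B_i, mean_B_ij):
    """Return (beta, a, E U_0): arrival rate, Palm probabilities a_ij, Palm mean mark."""
    n = len(pi)
    # rate[i][j] = pi_i * beta_i on the diagonal, pi_i * lambda_ij * q_ij off it
    rate = [[pi[i] * beta_i[i] if i == j else pi[i] * lam[i][j] * q[i][j]
             for j in range(n)] for i in range(n)]
    beta = sum(sum(row) for row in rate)
    a = [[rate[i][j] / beta for j in range(n)] for i in range(n)]
    mean_U0 = sum(a[i][j] * (mean_B_i[i] if i == j else mean_B_ij[i][j])
                  for i in range(n) for j in range(n))
    return beta, a, mean_U0


pi = stationary_two_state(lam)
beta, a, mean_U0 = palm_quantities(pi, lam, beta_i, q, mean_B_i, mean_B_ij)
rho = beta * mean_U0   # average amount of claims per unit time
```

With these made-up inputs, $\pi = (2/3, 1/3)$, $\beta = 13/6$ and $\rho = \beta EU_0 = 5/12$.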


Theorem 5.5 Consider a general stationary claim surplus process $\{S_t^*\}_{t\ge 0}$, let
$U_0$ be a r.v. having the Palm distribution of the claim size and $F(x) = P(U_0 \le x)$
its distribution. Assume that $S_t^* \to -\infty$ a.s. and that $\rho = \beta EU_0 < 1$. Then the
ladder height distribution $G_+^*$ is given by the (defective) density $g_+^*(x) = \beta\bar F(x)$.


Before giving the proof, we note:


Corollary 5.6 Under the assumptions of Theorem 5.5, the ruin probability
$\psi^*(0)$ with initial reserve $u = 0$ is $\rho = \beta EU_0$.


This follows by noting that

$$\psi^*(0) = \|G_+^*\| = \int_0^\infty g_+^*(x)\,dx = \beta\int_0^\infty \bar F(x)\,dx = \beta EU_0.$$

By (5.4),

$$\psi^*(0) = E\sum_{k:\ \sigma_k^*\in[0,1]} U_k^*;$$

here the r.h.s. has a very simple interpretation as the average amount of claims
received per unit time. The result is notable by giving an explicit expression
for ruin in great generality and by only depending on the parameters of the
model through the arrival rate $\beta$ and the average (in the Palm sense) claim size
$EU_0$. The last property is referred to as insensitivity in the applied probability
literature.


Proof of Theorem 5.5. A standard argument for stationary processes ([199, p.
105]) shows that one can assume w.l.o.g. that $\mathcal M^*$ and $\mathcal M$ have doubly infinite
time (i.e., are point processes on $(-\infty,\infty)\times(0,\infty)$). We then represent $\mathcal M$ by
the mark (claim size) $U_0$ of the arrival at time 0, the arrival times $0 < \sigma_1 <
\sigma_2 < \ldots$ in $(0,\infty)$ and the arrival times $0 > \sigma_{-1} > \sigma_{-2} > \ldots$ in $(-\infty,0)$; the
mark at time $\sigma_k$ is denoted by $U_k$.

Let $p(t)$ be the conditional probability that $S^*_{\tau_+^*} \in A$, $\tau_+^* = t$ given the event
$A_t$ that an arrival at $t$ occurs. Then clearly

$$G_+^*(A) = P(S^*_{\tau_+^*} \in A) = \int_0^\infty p(t)\,\beta\,dt.$$

Consider a process $\{\tilde S_t\}_{t\ge 0}$ which makes an upwards jump at time $-\sigma_{-k}$
$(k = 1,2,\ldots)$, moves down linearly at a unit rate in between jumps and starts
from $\tilde S_0 = U_0$.

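Now conditionally upon $A_t$, $\{S_u^*\}_{0\le u\le t}$ is distributed as a process $\{\hat S_u\}_{0\le u\le t}$
where a claim arrives at time $t$ and has size $U_0$, and the $k$th preceding claim
arrives at time $t - \sigma_{-k}$ and has size $U_{-k}$. The sample path relation between
$\{\tilde S_u\}$ and $\{\hat S_u\}$ amounts to $\tilde S_u = \hat S_t - \hat S_{(t-u)-}$ (left limit) when $0 < u < t$ and is
illustrated on Fig. III.6. It follows that for $A \subseteq (0,\infty)$

$$\begin{aligned}
p(t) &= P(\hat S_t \in A,\ \hat S_u \le 0,\ 0 < u < t)\\
&= P(\tilde S_t \in A,\ \tilde S_t \le \tilde S_{t-u},\ 0 < u < t)\\
&= P(\tilde S_t \in A,\ \tilde S_t \le \tilde S_u,\ 0 < u < t) = P(\tilde S_t \in A;\ M_t)
\end{aligned}$$

where $M_t = \{\tilde S_t \le \tilde S_u,\ 0 < u < t\}$ is the event that $\{\tilde S_u\}$ has a relative minimum
at $t$. In Fig. III.6, time instants corresponding to such minimal values have been
marked with bold lines in the path of $\{\tilde S_t\}$, and we let $L(dy)$ be the random
measure

$$L(A) = \int_0^\infty I(\tilde S_t \in A;\ M_t)\,dt.$$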


Since $\tilde S_0 = U_0$, the support of $L$ has right endpoint $U_0$, and since by
assumption $S_t^* \to -\infty$ a.s., $t \to \infty$, the left endpoint of the support is $-\infty$. A sample
path inspection just as in the proof of Lemma 5.2 therefore immediately shows
that $L(dy)$ is Lebesgue measure on $(-\infty, U_0]$, cf. Fig. III.6 where the boxes on
the time axis correspond to time intervals where $\{\tilde S_t\}$ is at a minimum belonging
to $A$ and split $A$ into pieces corresponding to segments where $\{\tilde S_u\}$ is at a
relative minimum. Thus,

$$\begin{aligned}
G_+^*(A) &= \beta\int_0^\infty P(\tilde S_t \in A;\ M_t)\,dt = \beta\,E L(A)\\
&= \beta\,E\int I(U_0 > y)\,I(y \in A)\,dy = \beta\int_A P(U_0 > y)\,dy\\
&= \beta\int_A \bar F(y)\,dy.
\end{aligned}$$

FIGURE III.6


Notes and references Theorem 5.5 is due to Schmidt & co-workers [102, 372, 
648] (a special case of the result appears in Proposition VII.2.1). A further relevant 
reference related to Corollary 5.6 is Björk & Grandell [170].

Two alternative somewhat simpler approaches to prove Theorem 5.1 will be given 
in Chapter IV (after Theorem IV.2.1 and in Remark IV.3.6). 


Chapter IV 


The compound Poisson 
model 


We consider throughout this chapter a risk reserve process $\{R_t\}_{t\ge 0}$ in the
terminology and notation of Chapter I, and assume that


• $\{N_t\}_{t\ge 0}$ is a Poisson process with rate $\beta$.


• the claim sizes $U_1, U_2, \ldots$ are i.i.d. with common distribution $B$, say, and
independent of $\{N_t\}$.


• the premium rate is $p = 1$.


Thus, $\{R_t\}$ and the associated claims surplus process $\{S_t\}$ are given by

$$R_t = u + t - \sum_{i=1}^{N_t} U_i, \qquad S_t = u - R_t = \sum_{i=1}^{N_t} U_i - t.$$


An important omission of the discussion in this chapter is the numerical 
evaluation of the ruin probability. Some possibilities are numerical Laplace 
transform inversion via Corollary 3.4 below, exact matrix-exponential solutions 
under the assumption that B is phase-type (see further IX.3), Panjer’s recursion 
(Corollary XVI.2.6) and simulation methods (Chapter XV). For finite horizon 
ruin probabilities, see Chapter V. 

It is worth mentioning that much of the analysis of this chapter can be 
carried over in a direct way to more general Lévy processes, see Chapter XI. 
1 Introduction


For later reference, we shall start by giving the basic formulas for moments,
cumulants, m.g.f.'s etc. of the claim surplus $S_t = u - R_t$. Write

$$\mu_B^{(k)} = EU^k, \qquad \mu_B = \mu_B^{(1)} = EU, \qquad \rho = \beta\mu_B = 1/(1+\eta).$$


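Proposition 1.1 (a) $ES_t = t(\beta\mu_B - 1) = t(\rho - 1)$;
(b) $\mathrm{Var}\,S_t = t\beta\mu_B^{(2)}$;
(c) $E e^{rS_t} = e^{t\kappa(r)}$ where $\kappa(r) = \beta(\hat B[r] - 1) - r$;
(d) The $k$th cumulant of $S_t$ is $t\beta\mu_B^{(k)}$ for $k \ge 2$.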


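Proof. It was noted in Chapter I that $\rho - 1$ is the expected claim surplus per
unit time, and this immediately yields (a). A more formal proof goes as follows:

$$ES_t = E\sum_{k=1}^{N_t} U_k - t = E\,E\Bigl[\sum_{k=1}^{N_t} U_k \Bigm| N_t\Bigr] - t
= E[N_t\mu_B] - t = \beta t\mu_B - t = t(\rho - 1).$$

The same method yields also the variance as

$$\begin{aligned}
\mathrm{Var}\,S_t = \mathrm{Var}\sum_{k=1}^{N_t} U_k
&= \mathrm{Var}\,E\Bigl[\sum_{k=1}^{N_t} U_k \Bigm| N_t\Bigr] + E\,\mathrm{Var}\Bigl[\sum_{k=1}^{N_t} U_k \Bigm| N_t\Bigr]\\
&= \mathrm{Var}[N_t\mu_B] + E[N_t\,\mathrm{Var}\,U] = t\beta\mu_B^2 + t\beta\,\mathrm{Var}\,U = t\beta\mu_B^{(2)}.
\end{aligned}$$

For (c), we get

$$\begin{aligned}
E e^{rS_t} &= e^{-rt}\sum_{k=0}^\infty E e^{r(U_1+\cdots+U_k)}\,P(N_t = k)
= e^{-rt}\sum_{k=0}^\infty \hat B[r]^k\, e^{-\beta t}\frac{(\beta t)^k}{k!}\\
&= \exp\{-rt - \beta t + \beta\hat B[r]t\} = e^{t\kappa(r)}.
\end{aligned}$$

Finally, for (d) just note that the $k$th cumulant of $S_t$ is $t\kappa^{(k)}(0)$, where $\kappa^{(k)}(0)$
is the $k$th derivative of $\kappa$ at 0, and that $\hat B^{(k)}[0] = \mu_B^{(k)}$.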
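Parts (a) and (b) of Proposition 1.1 can be checked by simulation. The sketch below assumes exponential claims with rate `delta` (an illustrative choice, so that $\mu_B = 1/\delta$ and $\mu_B^{(2)} = 2/\delta^2$); the function names and parameter values are hypothetical, and the comparison is only as good as the Monte Carlo error.

```python
import random


def simulate_claim_surplus(t, beta, delta, rng):
    """One draw of S_t = sum_{i <= N_t} U_i - t, Poisson(beta) arrivals, Exp(delta) claims."""
    total = 0.0
    clock = rng.expovariate(beta)          # time of the first claim
    while clock <= t:
        total += rng.expovariate(delta)    # claim size
        clock += rng.expovariate(beta)     # next interarrival time
    return total - t


def sample_mean_var(n, t, beta, delta, seed=12345):
    """Sample mean and unbiased sample variance of n independent copies of S_t."""
    rng = random.Random(seed)
    xs = [simulate_claim_surplus(t, beta, delta, rng) for _ in range(n)]
    m = sum(xs) / n
    v = sum((x - m) ** 2 for x in xs) / (n - 1)
    return m, v


beta, delta, t = 2.0, 3.0, 10.0
mu1, mu2 = 1.0 / delta, 2.0 / delta ** 2   # first two moments of Exp(delta)
mean_theory = t * (beta * mu1 - 1.0)       # Proposition 1.1(a)
var_theory = t * beta * mu2                # Proposition 1.1(b)
m, v = sample_mean_var(20000, t, beta, delta)
```

The sample values `m` and `v` should be close to `mean_theory` and `var_theory`, up to fluctuations of order $n^{-1/2}$.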


The linear way the index $t$ enters in the formulas in Proposition 1.1 is the
same as if $\{S_t\}$ was a random walk indexed by $t = 0,1,2,\ldots$ The connections
to random walks are in fact fundamental, and there are at least two ways to
exploit this:

Recalling that $\sigma_k$ is the time of the $k$th claim, we have $S_{\sigma_k} - S_{\sigma_{k-1}} = U_k - T_k$,
where $T_k$ is the time between the $k$th and the $(k-1)$th claim. Obviously, the
$U_k - T_k$ are i.i.d. so that $\{S_{\sigma_k}\}$ is a random walk with mean

$$EU - ET = \mu_B - \frac{1}{\beta} = -\eta\mu_B$$

where $\eta$ is the safety loading. In this way, we get a discrete time random walk
imbedded in the claim surplus process $\{S_t\}$, which is often used in the literature
for obtaining information about $\{S_t\}$ and the ruin probabilities. For example,
obviously $\psi(u) = P(\max_k S_{\sigma_k} > u)$. We return to this approach in Chapter VI.

The point of view in the present chapter is, however, rather to view $\{S_t\}$
directly as a random walk in continuous time, meaning that the increments are
stationary and independent, cf. III.3, so we have a Lévy process. Here is one
immediate application:


Proposition 1.2 (DRIFT AND OSCILLATION)
(a) No matter the value of $\eta$, $S_t/t \to \rho - 1$ a.s. as $t \to \infty$;
(b) If $\eta < 0$, then $S_t \to \infty$ a.s.;
(c) If $\eta > 0$, then $S_t \to -\infty$ a.s.;
(d) If $\eta = 0$, then $\liminf_{t\to\infty} S_t = -\infty$, $\limsup_{t\to\infty} S_t = \infty$.


For the proof, we need the following lemma:

Lemma 1.3 If $nh \le t \le (n+1)h$, then

$$S_{nh} - h \le S_t \le S_{(n+1)h} + h.$$

Proof. We first note that for $u, v \ge 0$,

$$S_{u+v} \ge S_u - v.$$

Indeed, $S_{u+v} - S_u$ attains its minimal value when there are no arrivals in $(u, u+v]$,
and the value is then precisely $-v$. In particular, if $t = nh + v$ with $0 \le v \le h$,
then

$$S_t \ge S_{nh} - v \ge S_{nh} - h.$$

The inequality on the right in Lemma 1.3 is proved similarly.


Proof of Proposition 1.2. For any fixed $h$, $\{S_{nh}\}_{n=0,1,\ldots}$ is a discrete time random
walk, and hence by the strong law of large numbers, $S_{nh}/n \to ES_h = h(\rho - 1)$ a.s.
Thus using Lemma 1.3, we get

$$\liminf_{t\to\infty}\frac{S_t}{t} = \liminf_{n\to\infty}\ \inf_{nh\le t<(n+1)h}\frac{S_t}{t}
\ \ge\ \frac{1}{h}\liminf_{n\to\infty}\frac{S_{nh} - h}{n} = \frac{ES_h}{h} = \rho - 1.$$

A similar argument for $\limsup$ proves (a), and (b), (c) are immediate
consequences of (a). Part (d) follows by a (slightly more intricate) general random
walk result ([APQ, pp. 224-225]) stating that $\liminf_{n\to\infty} S_{nh} = -\infty$,
$\limsup_{n\to\infty} S_{nh} = \infty$ (Lemma 1.3 is not needed for (d)).


Corollary 1.4 The ruin probability $\psi(u)$ is 1 for all $u$ when $\eta \le 0$, and $< 1$
for all $u$ when $\eta > 0$.


Proof. The case of $\eta \le 0$ is immediate since then $M = \infty$ by Proposition 1.2.
If $\eta > 0$, it suffices to prove $\psi(0) = P(M > 0) < 1$. However, if $P(M > 0) = 1$
then $\{S_t\}$ upcrosses level 0 a.s. at least once. Considering the next downcrossing
(which occurs w.p. 1 since $S_t \to -\infty$) and repeating the argument, it is seen
that upcrossing occurs at least twice, hence by induction i.o. This contradicts
$S_t \to -\infty$.


There is also a central limit version of Proposition 1.2:


Proposition 1.5 The limiting distribution of $(S_t - t(\rho - 1))/\sqrt{t}$ as $t \to \infty$ is
normal with mean zero and variance $\beta\mu_B^{(2)}$.

Proof. Since $\{S_t\}_{t\ge 0}$ is a Lévy process (a random walk in continuous time),
$\{S_{nh}\}_{n=0,1,\ldots}$ is a discrete-time random walk for any $h > 0$, and hence it follows
from standard central limit theory and the expression $\mathrm{Var}(S_t) = t\beta\mu_B^{(2)}$
(Proposition 1.1(b)) that the assertion holds as $t \to \infty$ through values of the form
$t = 0, h, 2h, \ldots$ The general case now follows either by another easy application
of Lemma 1.3, or by a general result on discrete skeletons ([APQ, p. 415]).


Remark 1.6 Often it is of interest to consider size-fluctuations, where the size
of the portfolio at time $t$ is $M(t)$. Assuming that each risk generates claims at
Poisson intensity $\beta$ and pays premium 1 per unit time, this case can be reduced
to the compound Poisson model by an easy operational time transformation
$T^{-1}(t)$ where $T(s) = \beta\int_0^s M(t)\,dt$ (this was already pointed out by Lundberg
[614]).


Notes and references All material of the present section is standard. 
2 The Pollaczeck-Khinchine formula


The time to ruin $\tau(u)$ was already defined in Chapter I as $\inf\{t > 0 : S_t > u\}$,
and we shall here exploit the decomposition of the maximum $M$ as sum of ladder
heights, cf. Fig. III.2. We assume throughout $\eta > 0$ or, equivalently, $\rho < 1$.

It is crucial to note that for the compound Poisson model, the ladder heights
are i.i.d. This follows simply by noting that the process repeats itself after
reaching a relative maximum. The decomposition of $M$ as a sum of ladder
heights now yields:


Theorem 2.1 The distribution of $M$ is $(1 - \|G_+\|)\sum_{n=0}^\infty G_+^{*n}$, where $G_+$ is given
by the defective density $g_+(x) = \beta\bar B(x) = \rho b_0(x)$ on $(0,\infty)$. Here $b_0(x) =
\bar B(x)/\mu_B$.


The formula for $g_+$ was already obtained in Theorem III.5.1, but before showing
the rest of Theorem 2.1, we give an alternative argument which is short and
intuitive, but also slightly heuristic:


Proof of $g_+(x) = \beta\bar B(x)$: Assume $B$ has a density $b$. Note that if there is a claim
arrival before $dt$, then $S_{\tau_+} \in (u, u+du]$ occurs precisely when the claim has size
$u$. Hence the contribution to $g_+(u)$ from this event is $b(u)\beta\,dt$. If there are no
claim arrivals before $dt$, consider the process $\{\tilde S_t\}_{t\ge 0}$ where $\tilde S_t = S_{t+dt} - S_{dt}
= S_{t+dt} + dt$. For $S_{\tau_+} \in (u, u+du]$ to occur, $\tilde S$ must either have its first ladder
point equal to $u + dt$ or $v \in (0, dt]$, and in the latter case the process starting
from $v$ must have its first ladder point equal to $u + v$, i.e. the probability is
$\int_0^{dt} g_+(v)g_+(u+v)\,dv$. Collecting all first order terms, it follows that

$$\begin{aligned}
g_+(u) &= b(u)\beta\,dt + (1 - \beta\,dt)\bigl(g_+(u+dt) + g_+(0)g_+(u)\,dt\bigr) + o(dt)\\
&= b(u)\beta\,dt + (1 - \beta\,dt)\bigl(g_+(u) + g_+'(u)\,dt + g_+(0)g_+(u)\,dt\bigr) + o(dt)\\
&= g_+(u) + dt\bigl(-\beta g_+(u) + g_+'(u) + g_+(0)g_+(u) + \beta b(u)\bigr) + o(dt),
\end{aligned}$$

$$g_+'(u) = \bigl(\beta - g_+(0)\bigr)g_+(u) - \beta b(u). \tag{2.1}$$

Integrating from 0 to $x$ gives

$$g_+(x) = g_+(0) + \bigl(\beta - g_+(0)\bigr)P(S_{\tau_+} \le x,\ \tau_+ < \infty) - \beta B(x).$$

Letting $x \to \infty$ and assuming (heuristic but reasonable!) that then $g_+(x) \to 0$,
we get

$$0 = g_+(0) + \bigl(\beta - g_+(0)\bigr)P(\tau_+ < \infty) - \beta = -\bigl(\beta - g_+(0)\bigr)P(\tau_+ = \infty).$$

Since $P(\tau_+ = \infty) > 0$ because of the assumption of a positive loading, we
therefore have $g_+(0) = \beta$. Thus (2.1) simply means $g_+'(u) = -\beta b(u)$, and the
solution satisfying $g_+(0) = \beta$ is $g_+(u) = \beta\bar B(u)$.


Proof of Theorem 2.1. The probability that $M$ is attained in precisely $n$ ladder
steps and does not exceed $x$ is $G_+^{*n}(x)(1 - \|G_+\|)$ (the parenthesis gives the
probability that there are no further ladder steps after the $n$th). Summing over
$n$, the formula for the distribution of $M$ follows.


Alternatively, we may view the ladder heights as a terminating renewal process,
and $M$ then becomes the lifetime.

Combined with $\psi(u) = P(M > u)$, Theorem 2.1 provides a representation
formula for $\psi(u)$, which we henceforth refer to as the Pollaczeck-Khinchine
formula. Note that the integrated tail distribution $B_0$ with density $b_0$ is familiar
from renewal theory as the limiting stationary distribution of the overshoot
(forward recurrence time), see [APQ, V.3-4] or A.1e. Thus, we can rewrite the
Pollaczeck-Khinchine formula as

$$\psi(u) = P(M > u) = (1 - \rho)\sum_{n=1}^\infty \rho^n\,\overline{B_0^{*n}}(u), \tag{2.2}$$

representing the distribution of $M$ as a geometric compound.

As a vehicle for computing $\psi(u)$, (2.2) is not entirely satisfying because of the
infinite sum of convolution powers, but we shall be able to extract substantial
information from the formula, nevertheless.
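For one tractable case the series (2.2) can nevertheless be summed numerically. The sketch below assumes exponential claims with rate `delta`, so that $B_0$ is again exponential with rate `delta` and $B_0^{*n}$ is an Erlang distribution whose tail is a finite Poisson sum; the function names and the truncation level `n_max` are illustrative assumptions. The truncated series is compared with the closed form $\rho e^{-(\delta-\beta)u}$ known for this case.

```python
import math


def erlang_tail(n, delta, u):
    """P(Erlang(n, delta) > u) = sum_{k < n} e^{-delta u} (delta u)^k / k!."""
    term = math.exp(-delta * u)   # k = 0 term
    total = 0.0
    for k in range(n):
        total += term
        term *= delta * u / (k + 1)
    return total


def psi_pk_exponential(u, beta, delta, n_max=400):
    """Truncated series (2.2) for exponential(delta) claims and arrival rate beta."""
    rho = beta / delta
    return (1.0 - rho) * sum(rho ** n * erlang_tail(n, delta, u)
                             for n in range(1, n_max + 1))


def psi_closed_form(u, beta, delta):
    """Known closed form rho * exp(-(delta - beta) u) for exponential claims."""
    return (beta / delta) * math.exp(-(delta - beta) * u)


beta, delta = 1.0, 2.0
values = [(u, psi_pk_exponential(u, beta, delta), psi_closed_form(u, beta, delta))
          for u in (0.0, 1.0, 2.5)]
```

The truncation error is of order $\rho^{n_{\max}}$, which is negligible here; for $\rho$ close to 1 many more terms would be needed.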

The following result generalizes the fact that the conditional distribution
of the deficit $S_{\tau(0)}$ just after ruin given that ruin occurs (i.e., that $\tau(0) <
\infty$) is $B_0$: taking $y = 0$ shows that the conditional distribution of the risk
reserve immediately before ruin (i.e. $-S_{\tau(0)-}$) is again $B_0$, and we further get
information about the joint conditional distribution of this quantity and the
deficit. Note that this distribution is the same as the limiting joint distribution
of the age and excess life in a renewal process governed by $B$, cf. Theorem A1.5.


Theorem 2.2 The joint distribution of $(-S_{\tau(0)-}, S_{\tau(0)})$ is given by the following
four equivalent statements:

(a) $P(-S_{\tau(0)-} > x,\ S_{\tau(0)} > y;\ \tau(0) < \infty) = \beta\int_{x+y}^\infty \bar B(z)\,dz$;

(b) the joint distribution of $(-S_{\tau(0)-}, S_{\tau(0)})$ given $\tau(0) < \infty$ is the same as the
distribution of $(VW, (1-V)W)$ where $V, W$ are independent, $V$ is uniform on
$(0,1)$ and $W$ has distribution $F_W$ given by $dF_W/dB(x) = x/\mu_B$;

(c) the marginal distribution of $-S_{\tau(0)-}$ is $B_0$, and the conditional distribution
of $S_{\tau(0)}$ given $-S_{\tau(0)-} = y$ is the overshoot distribution $B^{(y)}$ given by
$\bar B^{(y)}(z) = \bar B(y+z)/\bar B(y)$;

(d) the marginal distribution of $S_{\tau(0)}$ is $B_0$, and the conditional distribution of
$-S_{\tau(0)-}$ given $S_{\tau(0)} = z$ is $B^{(z)}$.


The proof is given in V.2 and it gives an alternative derivation of the distribution
of the deficit $S_{\tau(0)}$.


Notes and references The Pollaczeck-Khinchine formula is standard in queueing
theory, see for example [APQ], Feller [362] or Wolff [894]. The proof of Theorem
III.5.1 is traditionally carried out for the imbedded discrete time random walk, where
it requires slightly more calculation. As shown in Theorem III.5.5, the form of $G_+$ is
surprisingly insensitive to the form of $\{S_t\}$ and holds in a certain general marked point
process set-up. However, in this setting there is no decomposition of $M$ as a sum of
i.i.d. ladder heights so that the results do not appear too useful for estimating $\psi(u)$
for $u > 0$.

Theorem 2.2(a) is from Dufresne & Gerber [333]. Again, there is a general marked
point process version, cf. Asmussen & Schmidt [103]. For the study of the joint
distribution of the surplus $S_{\tau(u)-}$ just before ruin and the deficit $S_{\tau(u)}$ at ruin, see Schmidli
[773] and references therein. In Chapter XII these results will be generalized in various
directions.

In risk theory literature, the Pollaczeck-Khinchine formula is often referred to as
Beekman's convolution formula, cf. Beekman [152, 153].


3 Special cases of the Pollaczeck-Khinchine formula


The model and notation is the same as in the preceding sections. We assume
$\eta > 0$ throughout.


3a The ruin probability when the initial reserve is zero


The case $u = 0$ is remarkable by giving a formula for $\psi(u)$ which depends on
the claim size distribution only through its mean:


Corollary 3.1 $\psi(0) = \rho = \beta\mu_B = \dfrac{1}{1+\eta}$.


Proof. Recall that $\tau_+ = \tau(0)$ and note that

$$\psi(0) = P(\tau_+ < \infty) = \|G_+\| = \beta\int_0^\infty \bar B(x)\,dx = \beta\mu_B.$$

Notes and references The fact that $\psi(0)$ only depends on $B$ through $\mu_B$ is often
referred to as an insensitivity property. As shown in III.6, the formula for $\psi(0)$ holds
in a more general setting; a slightly modified version also holds for certain two-sided
jumps, cf. Section XII.4. A further relevant reference is Björk & Grandell [170].


3b Exponential claims 


Corollary 3.2 If $B$ is exponential with rate $\delta$, then $\psi(u) = \rho e^{-(\delta-\beta)u}$.


Proof. The distribution $B_0$ of the ascending ladder height (given that it is
defined) is the distribution of the overshoot of $\{S_t\}$ at time $\tau_+$ over level 0. But
claims are exponential, hence without memory, and so this overshoot has the
same distribution as the claims themselves. I.e., $B_0$ is exponential with rate
$\delta$ and the result can now be proved from the Pollaczeck-Khinchine formula by
elementary calculations. Thus, $B_0^{*n}$ is the Erlang distribution with $n$ phases
and the density of $M$ at $x > 0$ is

$$\sum_{n=1}^\infty (1-\rho)\rho^n\,\frac{\delta^n x^{n-1}}{(n-1)!}\,e^{-\delta x}
= (1-\rho)\rho\delta\,e^{-\delta(1-\rho)x} = \rho(\delta-\beta)\,e^{-(\delta-\beta)x}.$$

Integrating from $u$ to $\infty$, the result follows. Alternatively, use Laplace
transforms.

The result can, however, also be seen probabilistically without summing
infinite series. Let $\lambda(x)$ be the failure rate of $M$ at $x > 0$. For a failure at $x$, the
current ladder step must terminate which occurs at rate $\delta$ and there must be no
further ones which occurs w.p. $1 - \rho$. Thus $\lambda(x) = \delta(1-\rho) = \delta - \beta$ so that the
conditional distribution of $M$ given $M > 0$ is exponential with rate $\delta - \beta$ and

$$\psi(u) = P(M > u) = P(M > 0)\,P(M > u \mid M > 0) = \rho e^{-(\delta-\beta)u}.$$

In IX.3, we show that expressions for $\psi(u)$ which are explicit (up to matrix
exponentials) come out in a similar way also when $B$ is phase-type. E.g.
(Example IX.3.2), if $\beta = 3$ and $B$ is a mixture of two exponential distributions with
rates 3 and 7, and weights 1/2 for each, then

$$\psi(u) = \frac{24}{35}e^{-u} + \frac{1}{35}e^{-6u}. \tag{3.1}$$

For heavy-tailed $B$, we use the Pollaczeck-Khinchine formula in Chapter X to
show that

$$\psi(u) \sim \frac{\rho}{1-\rho}\,\bar B_0(u), \qquad u \to \infty.$$

Notes and references Corollary 3.2 is one of the main classical early results in
the area. A variety of proofs are available. We mention in particular the following:
(a) check that $\psi(u) = \rho e^{-(\delta-\beta)u}$ is the solution of the renewal equation (3.2) below;
(b) use stopped martingales, cf. II.3.


3c Some classical analytical results


Recall the notation $\bar G_+(u) = \int_u^\infty G_+(dx)$.


Corollary 3.3 The ruin probability $\psi(u)$ satisfies the defective renewal equation

$$\psi(u) = \bar G_+(u) + G_+ * \psi(u) = \beta\int_u^\infty \bar B(y)\,dy + \int_0^u \psi(u-y)\,\beta\bar B(y)\,dy. \tag{3.2}$$

Equivalently, the survival probability $\phi(u) = 1 - \psi(u)$ satisfies the defective
renewal equation

$$\phi(u) = 1 - \rho + G_+ * \phi(u) = 1 - \rho + \int_0^u \phi(u-y)\,\beta\bar B(y)\,dy. \tag{3.3}$$

Proof. Write $\psi(u)$ as

$$P(M > u) = P(S_{\tau_+} > u,\ \tau_+ < \infty) + P(M > u,\ S_{\tau_+} \le u,\ \tau_+ < \infty).$$

Then the first term on the r.h.s. is $\bar G_+(u)$, and conditioning upon $S_{\tau_+} = y$ yields

$$P(M > u,\ S_{\tau_+} \le u,\ \tau_+ < \infty) = \int_0^u P(M > u-y)\,G_+(dy) = \int_0^u \psi(u-y)\,G_+(dy).$$

For the last identity in (3.2), just insert the explicit form of $G_+$. The case of
(3.3) is similar (equivalently, (3.3) can be derived by elementary algebra from
(3.2)).


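Corollary 3.4 The Laplace transform of the ruin probability is

$$\int_0^\infty e^{-su}\psi(u)\,du = \frac{\beta - \beta\hat B[-s] - \rho s}{s\bigl(\beta - s - \beta\hat B[-s]\bigr)}. \tag{3.4}$$

Proof. We first find the m.g.f. $\hat B_0$ of $B_0$ as

$$\hat B_0[r] = \int_0^\infty e^{ru}\,\frac{\bar B(u)}{\mu_B}\,du = \frac{\hat B[r] - 1}{r\mu_B}. \tag{3.5}$$

Hence

$$E e^{rM} = (1-\rho)\sum_{n=0}^\infty \rho^n \hat B_0[r]^n = \frac{1-\rho}{1 - \rho\hat B_0[r]}
= \frac{(1-\rho)r}{\beta + r - \beta\hat B[r]}, \tag{3.6}$$

which is the same as (3.4).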
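Formula (3.4) can be sanity-checked against the two explicit ruin probabilities of Section 3b, whose Laplace transforms are elementary. The sketch below is only a numerical check under stated assumptions: the function name `laplace_psi` and the test points are illustrative, and the mixture example is read as $\psi(u) = \frac{24}{35}e^{-u} + \frac{1}{35}e^{-6u}$ with $\beta = 3$ and equal-weight exponential components with rates 3 and 7.

```python
def laplace_psi(s, beta, mgf_B, mean_B):
    """Right-hand side of (3.4); mgf_B(r) is the m.g.f. of the claim size B at r."""
    rho = beta * mean_B
    b = mgf_B(-s)
    return (beta - beta * b - rho * s) / (s * (beta - s - beta * b))


# Case 1: exponential claims with rate delta, psi(u) = rho * exp(-(delta - beta) u).
beta1, delta1 = 1.0, 2.0
mgf_exp = lambda r: delta1 / (delta1 - r)
exact_exp = lambda s: (beta1 / delta1) / (delta1 - beta1 + s)

# Case 2: beta = 3, B = 1/2 Exp(3) + 1/2 Exp(7), psi(u) = 24/35 e^{-u} + 1/35 e^{-6u}.
beta2 = 3.0
mgf_mix = lambda r: 0.5 * 3.0 / (3.0 - r) + 0.5 * 7.0 / (7.0 - r)
mean_mix = 0.5 * (1.0 / 3.0 + 1.0 / 7.0)
exact_mix = lambda s: (24.0 / 35.0) / (s + 1.0) + (1.0 / 35.0) / (s + 6.0)

checks = []
for s in (0.5, 1.0, 2.0):
    checks.append((laplace_psi(s, beta1, mgf_exp, 1.0 / delta1), exact_exp(s)))
    checks.append((laplace_psi(s, beta2, mgf_mix, mean_mix), exact_mix(s)))
```

Each pair in `checks` should agree up to rounding error, which supports both (3.4) and the constants in the mixture example.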


Corollary 3.5 The first two moments of $M$ are

$$EM = \int_0^\infty \psi(u)\,du = \frac{\rho\mu_B^{(2)}}{2(1-\rho)\mu_B}, \qquad
EM^2 = \frac{\rho\mu_B^{(3)}}{3(1-\rho)\mu_B} + \frac{\rho^2\bigl(\mu_B^{(2)}\bigr)^2}{2(1-\rho)^2\mu_B^2}. \tag{3.7}$$


Proof. This can be shown, for example, by analytical manipulations (L'Hôpital's
rule) from (3.6). We omit the details (see, e.g., [APQ, p. 237]).
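As a quick consistency check of Corollary 3.5, one can compare with exponential claims, where $M$ equals 0 w.p. $1-\rho$ and is exponential with rate $\delta-\beta$ w.p. $\rho$, so that $EM = \rho/(\delta-\beta)$ and $EM^2 = 2\rho/(\delta-\beta)^2$. The helper name `moments_of_M` and the parameter values below are illustrative assumptions.

```python
def moments_of_M(beta, m1, m2, m3):
    """First two moments of M from Corollary 3.5; m1, m2, m3 are the moments of B."""
    rho = beta * m1
    EM = rho * m2 / (2.0 * (1.0 - rho) * m1)
    EM2 = (rho * m3 / (3.0 * (1.0 - rho) * m1)
           + rho ** 2 * m2 ** 2 / (2.0 * (1.0 - rho) ** 2 * m1 ** 2))
    return EM, EM2


beta, delta = 1.0, 2.0
# Moments of an exponential(delta) claim: k! / delta^k.
m1, m2, m3 = 1.0 / delta, 2.0 / delta ** 2, 6.0 / delta ** 3
EM, EM2 = moments_of_M(beta, m1, m2, m3)

rho = beta / delta
EM_direct = rho / (delta - beta)               # mean of the mixture 0 / Exp(delta - beta)
EM2_direct = 2.0 * rho / (delta - beta) ** 2   # second moment of the same mixture
```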


Remark 3.6 As mentioned in the Notes below, one can also derive (3.2) by
analytical techniques. At the same time, (3.4) follows from (3.2) directly by
taking Laplace transforms and noting that the Laplace transform of a convolution
of two functions is the product of their Laplace transforms. The Laplace
transform of the survival probability $\phi(u)$ correspondingly is

$$\hat\phi[-s] = \int_0^\infty e^{-su}\phi(u)\,du = \frac{1-\rho}{s - \beta\bigl(1 - \hat B[-s]\bigr)}.$$

This can now be used to provide yet another more analytical proof of the ladder
height density for a compound Poisson process. From $\phi(u) = P(M \le u)$ (or
from (3.5)) one sees that

$$E e^{-sM} = \phi(0) + \int_0^\infty e^{-su}\phi'(u)\,du = \frac{(1-\rho)s}{s - \beta\bigl(1 - \hat B[-s]\bigr)}.$$

On the other hand, as a sum of i.i.d. ladder heights, $M$ is a geometric compound
with $E(e^{-sM}) = E\bigl((\hat G_+[-s]/\rho)^N\bigr)$ and $N$ is geometric$(1-\rho)$, leading to
$E e^{-sM} = (1-\rho)/(1 - \hat G_+[-s])$. A comparison of those two representations for
$E e^{-sM}$ now gives $\hat G_+[-s] = \beta\bigl(1 - \hat B[-s]\bigr)/s$, so that $g_+(x) = \rho b_0(x)$.




Notes and references Corollary 3.3 is standard, see e.g. [APQ, pp. 144-145] or
Feller [362]. The approach there is to condition upon the first claim occurring at time
$t$ and having size $x$, which yields the survival probability as

$$\phi(u) = \int_0^\infty \beta e^{-\beta t}\,dt \int_0^{u+t} \phi(u + t - x)\,B(dx),$$

from which (3.3) can be derived by elementary but tedious manipulations (in Section
XII.3 a formal procedure will be discussed that is applicable in much more general
models). Of course, it is not surprising that such arguments are more cumbersome
since the ladder height representation is not used.

Also (3.6) and Corollary 3.5 can be found in virtually any queueing book. In fact,
either of these sets of formulas are what many authors call the Pollaczeck-Khinchine
formula.

In view of (3.4), numerical inversion of the Laplace transform is one of the classical
approaches for computing ruin probabilities, see e.g. Abate & Whitt [2], Embrechts,
Grübel & Pitts [346], Grübel [438], Thorin & Wikstad [848] and Albrecher, Avram &
Kortschak [14] (see also the Bibliographical Notes in [746, p. 191]).


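3d Deterministic claims

Corollary 3.7 If $B$ is degenerate at $\mu$, then

$$\psi(u) = 1 - (1-\rho)\sum_{k=0}^{\lfloor u/\mu\rfloor} e^{-\rho(k - u/\mu)}\,\frac{[\rho(k - u/\mu)]^k}{k!}.$$


Proof. By replacing $\{S_t\}$ by $\{S_{t\mu}/\mu\}$ if necessary, we may assume $\mu = 1$ so that
the stated formula in terms of the survival probability $\phi(u) = 1 - \psi(u)$ takes
the form

$$\phi(u) = (1-\beta)\sum_{k=0}^{\lfloor u\rfloor} e^{-\beta(k-u)}\,\frac{[\beta(k-u)]^k}{k!}. \tag{3.8}$$

The renewal equation (3.3) for $\phi$ means

$$\phi(u) = 1 - \beta + \beta\int_0^{1\wedge u} \phi(u-y)\,dy = 1 - \beta + \beta\int_{u - 1\wedge u}^{u} \phi(y)\,dy,$$

that is,

$$\phi(u) = 1 - \beta + \beta\int_0^u \phi(y)\,dy, \quad 0 \le u \le 1, \qquad
\phi(u) = 1 - \beta + \beta\int_{u-1}^u \phi(y)\,dy, \quad 1 \le u < \infty.$$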




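For $0 < u < 1$, differentiation yields $\phi'(u) = \beta\phi(u)$ which together with the
boundary condition $\phi(0) = 1 - \beta$ yields $\phi(u) = (1-\beta)e^{\beta u}$ so that (3.8) follows
for $0 < u < 1$. Assume (3.8) shown for $n - 1 \le u \le n$ and let $\tilde\phi(u)$ denote the
r.h.s. of (3.8). For $n < u < n+1$, differentiation yields $\phi'(u) = \beta\phi(u) - \beta\phi(u-1)$,

$$\begin{aligned}
\tilde\phi'(u) &= (1-\beta)\Bigl\{\beta\sum_{k=0}^n e^{-\beta(k-u)}\frac{[\beta(k-u)]^k}{k!}
- \beta\sum_{k=1}^n e^{-\beta(k-u)}\frac{[\beta(k-u)]^{k-1}}{(k-1)!}\Bigr\}\\
&= \beta\tilde\phi(u) - \beta(1-\beta)\sum_{k=0}^{n-1} e^{-\beta(k-u+1)}\frac{[\beta(k-u+1)]^k}{k!}\\
&= \beta\tilde\phi(u) - \beta\phi(u-1).
\end{aligned}$$

Since $\tilde\phi(n) = \phi(n)$ by the induction hypothesis, it follows that $\phi(u) = \tilde\phi(u)$ for
$n < u < n+1$.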
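The finite sum of Corollary 3.7 is straightforward to evaluate. The following is a minimal sketch with a hypothetical function name `psi_deterministic`; it is checked against $\psi(0) = \rho$ and against the form $\phi(u) = (1-\beta)e^{\beta u}$ on $0 \le u < 1$ obtained in the proof (with $\mu = 1$). Since the terms alternate in sign, the sum can lose accuracy in floating point for large $u$, so the sketch is intended for moderate $u$ only.

```python
import math


def psi_deterministic(u, beta, mu):
    """Ruin probability for claims degenerate at mu (Corollary 3.7), rho = beta*mu < 1."""
    rho = beta * mu
    total = 0.0
    for k in range(int(math.floor(u / mu)) + 1):
        x = rho * (k - u / mu)             # non-positive for k <= u/mu
        total += math.exp(-x) * x ** k / math.factorial(k)
    return 1.0 - (1.0 - rho) * total


beta, mu = 0.5, 1.0
psi_0 = psi_deterministic(0.0, beta, mu)     # equals rho
psi_half = psi_deterministic(0.5, beta, mu)  # equals 1 - (1 - beta) * exp(beta * 0.5)
psi_1 = psi_deterministic(1.0, beta, mu)
psi_2 = psi_deterministic(2.0, beta, mu)
```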


Notes and references Corollary 3.7 is identical to the formula for the M/D/1
waiting time distribution derived by Erlang [356]. See also Iversen & Staalhagen [496]
for a discussion of computational aspects and further references.


4 Change of measure via exponential families 


If $X$ is a random variable with c.d.f. $F$ and c.g.f.

$$\kappa(\alpha) = \log E e^{\alpha X} = \log\int e^{\alpha x}F(dx) = \log\hat F[\alpha],$$

the standard definition of the exponential family $\{F_\theta\}$ generated by $F$ is

$$F_\theta(dx) = e^{\theta x - \kappa(\theta)}F(dx), \tag{4.1}$$

or equivalently, in terms of the c.g.f. of $F_\theta$,

$$\kappa_\theta(\alpha) = \kappa(\alpha + \theta) - \kappa(\theta). \tag{4.2}$$

(Here $\theta$ is any number such that $\kappa(\theta)$ is well-defined.)

The adaptation of this construction to Lévy processes (such as $\{S_t\}$) has been
carried out in III.3, but will now be repeated for the sake of self-containedness.
We could first tentatively consider the claim surplus $X = S_t$ for a single $t$, say
$t = 1$: recall from Proposition 1.1 that $\kappa(\alpha) = \beta(\hat B[\alpha] - 1) - \alpha$, and define $\kappa_\theta$ by
(4.2). The question then naturally arises whether $\kappa_\theta$ is the c.g.f. corresponding
to a compound Poisson risk process in the sense that for a suitable arrival
intensity $\beta_\theta$ and a suitable claim size distribution $B_\theta$ we have

$$\kappa_\theta(\alpha) = \kappa(\alpha + \theta) - \kappa(\theta) = \beta_\theta(\hat B_\theta[\alpha] - 1) - \alpha. \tag{4.3}$$

The answer is yes: inserting in (4.2) shows that the solution is

$$\beta_\theta = \beta\hat B[\theta], \qquad B_\theta(dx) = \frac{e^{\theta x}}{\hat B[\theta]}\,B(dx)
\quad\text{or equivalently}\quad \hat B_\theta[\alpha] = \frac{\hat B[\alpha + \theta]}{\hat B[\theta]}. \tag{4.4}$$

Repeating for $t \ne 1$, we just have to multiply (4.3) by $t$, and thus (4.4) works
as well. Formalizing this for the purpose of studying the whole process $\{S_t\}$, we
set up


Definition 4.1 Let $P$ be the probability measure on $D[0,\infty)$ governing a given
compound Poisson risk process with arrival intensity $\beta$ and claim size distribution
$B$, and define $\beta_\theta$, $B_\theta$ by (4.4). Then $P_\theta$ denotes the probability measure
governing the compound Poisson risk process with arrival intensity $\beta_\theta$ and claim
size distribution $B_\theta$; the corresponding expectation operator is $E_\theta$.
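To make (4.3) and (4.4) concrete, the sketch below assumes exponential claims with rate `delta`, for which $\hat B[r] = \delta/(\delta - r)$ and the tilted claim size distribution $B_\theta$ is again exponential, with rate $\delta - \theta$. The function names are hypothetical; the code merely checks numerically that the tilted parameters reproduce $\kappa(\alpha+\theta) - \kappa(\theta)$.

```python
def kappa(alpha, beta, delta):
    """c.g.f. kappa(alpha) = beta (B^[alpha] - 1) - alpha for exponential(delta) claims."""
    return beta * (delta / (delta - alpha) - 1.0) - alpha


def tilt(theta, beta, delta):
    """Tilted parameters from (4.4): beta_theta = beta * B^[theta], claim rate delta - theta."""
    beta_theta = beta * delta / (delta - theta)
    delta_theta = delta - theta
    return beta_theta, delta_theta


beta, delta, theta, alpha = 1.0, 2.0, 0.5, 0.3
beta_theta, delta_theta = tilt(theta, beta, delta)

lhs = kappa(alpha, beta_theta, delta_theta)                       # kappa_theta(alpha) via (4.3)
rhs = kappa(alpha + theta, beta, delta) - kappa(theta, beta, delta)  # via (4.2)
```

The two values `lhs` and `rhs` coincide for any admissible `theta` and `alpha` with `alpha + theta < delta`.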


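The following result (Proposition 4.2, with $T$ taking the role of $n$) is the
analogue of the expression

$$\exp\{\theta(x_1 + \cdots + x_n) - n\kappa(\theta)\} \tag{4.5}$$

for the density of $n$ i.i.d. replications from $F_\theta$ (replace $x$ by $x_i$ in (4.1) and
multiply from 1 to $n$).

Let $\mathcal F_T = \sigma(S_t : t \le T)$ denote the $\sigma$-algebra spanned by the $S_t$, $t \le T$, and
$P_\theta^{(T)}$ the restriction of $P_\theta$ to $\mathcal F_T$.


Proposition 4.2 For any fixed $T$, the $P_\theta^{(T)}$ are mutually equivalent on $\mathcal F_T$, and

$$\frac{dP_\theta^{(T)}}{dP^{(T)}} = \exp\{\theta S_T - T\kappa(\theta)\}.$$

That is, for $G \in \mathcal F_T$,

$$P(G) = P_0(G) = E_\theta\bigl[\exp\{-\theta S_T + T\kappa(\theta)\};\ G\bigr]. \tag{4.6}$$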
Proof. We must prove that if $Z$ is $\mathcal F_T$-measurable, then

$$E_\theta Z = E\bigl[Z e^{\theta S_T - T\kappa(\theta)}\bigr]. \tag{4.7}$$

By standard measure theory, it suffices to consider the case where $Z$ is
measurable w.r.t. $\mathcal F_T^{(n)} = \sigma(S_{kT/n} : k = 0,1,\ldots,n)$ for a given $n$. But let $X_k =
S_{kT/n} - S_{(k-1)T/n}$. Then the $X_k$ are i.i.d. with common c.g.f. $T\kappa(\alpha)/n$, $Z$ is
measurable w.r.t. $\sigma(X_1,\ldots,X_n)$, and thus (4.7) follows by discrete exponential
family theory, in particular the expression (4.5) for the density. The identity
(4.6) now follows by taking $Z = e^{-\theta S_T + T\kappa(\theta)}I(G)$.


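Theorem 4.3 Let $\tau$ be any stopping time and let $G \in \mathcal F_\tau$, $G \subseteq \{\tau < \infty\}$.
Then

$$P(G) = P_0(G) = E_\theta\bigl[\exp\{-\theta S_\tau + \tau\kappa(\theta)\};\ G\bigr]. \tag{4.8}$$

Proof. We first note that for any fixed $t$,

$$E_\theta e^{-\theta S_t + t\kappa(\theta)} = 1. \tag{4.9}$$

Now assume first that $G \subseteq \{\tau \le T\}$ for some deterministic $T$. Then $G \in \mathcal F_T$,
and hence (4.6) holds. Given $\mathcal F_\tau$, $t = T - \tau$ is deterministic. Thus by (4.9),

$$E_\theta\bigl[\exp\{-\theta(S_T - S_\tau) + (T - \tau)\kappa(\theta)\} \bigm| \mathcal F_\tau\bigr] = 1,$$

so that $P(G)$ equals

$$\begin{aligned}
&E_\theta\,E_\theta\bigl[\exp\{-\theta S_T + T\kappa(\theta)\}\,I(G) \bigm| \mathcal F_\tau\bigr]\\
&\quad= E_\theta\bigl[\exp\{-\theta S_\tau + \tau\kappa(\theta)\}\,I(G)\,E_\theta\bigl[\exp\{-\theta(S_T - S_\tau) + (T - \tau)\kappa(\theta)\} \bigm| \mathcal F_\tau\bigr]\bigr]\\
&\quad= E_\theta\bigl[\exp\{-\theta S_\tau + \tau\kappa(\theta)\}\,I(G)\bigr].
\end{aligned}$$

Now consider a general $G$. Then $G_T = G \cap \{\tau \le T\}$ satisfies $G_T \in \mathcal F_\tau$,
$G_T \subseteq \{\tau \le T\}$. Thus, according to what has just been proved, (4.8) holds with
$G$ replaced by $G_T$. Letting $T \uparrow \infty$ and using monotone convergence then shows
that (4.8) holds for $G$ as well.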



5 Lundberg conjugation 


Being a c.g.f., $\kappa(\alpha)$ is a convex function of $\alpha$. The behavior at zero is given by
the first order Taylor expansion

$$\kappa(\alpha) \approx \kappa(0) + \kappa'(0)\alpha = 0 + ES_1 \cdot \alpha = \alpha(\rho - 1) = -\frac{\eta\alpha}{1+\eta}.$$

Thus, subject to the basic assumption $\eta > 0$ of a positive safety loading, the
typical shape of $\kappa$ is as in Fig. IV.1(a).

FIGURE IV.1 [figure not reproduced: panel (a) shows $\kappa(\alpha)$, panel (b) shows $\kappa_L(\alpha)$; $\gamma$ is marked on the $\alpha$-axis]


When the tail of the claim size distribution is exponentially bounded, then
typically a $\gamma > 0$ satisfying

$$0 = \kappa(\gamma) = \beta(\hat B[\gamma] - 1) - \gamma \tag{5.1}$$

exists. Equation (5.1) is known as the Lundberg equation and plays a fundamental
role in risk theory; an equivalent version illustrated in Fig. IV.2 is

$$\hat B[\gamma] = 1 + \frac{\gamma}{\beta}. \tag{5.2}$$

FIGURE IV.2 [figure not reproduced: the two sides of (5.2) plotted against the argument, intersecting at $\gamma$]


As support for memory, we write $P_L$ instead of $P_\gamma$, $\beta_L$ instead of $\beta_\gamma$, and so on
in the following. Note that

$$\kappa_L(\alpha) = \beta_L(\hat B_L[\alpha] - 1) - \alpha = \kappa(\alpha + \gamma),$$

cf. Fig. IV.1(b). An established terminology is to call $\gamma$ the adjustment coefficient
but there are various alternatives around, e.g. the Lundberg exponent.

Example 5.1 Consider the case of exponential claims, $\hat B[r] = \delta/(\delta - r)$. It is
then readily seen that the non-zero solution of (5.1) (or (5.2)) is $\gamma = \delta - \beta$. Thus
$\hat B[\gamma] = \delta/\beta$, and (4.4) yields $\beta_L = \delta$ and that $B_L$ is again exponential with rate
$\delta_L = \beta$. Thus, Lundberg conjugation corresponds to interchanging the rates of
the interarrival times and the claim sizes.


It is a crucial fact that when governed by $P_L$, the claim surplus process has
positive drift

$$E_L S_1 = \kappa_L'(0) = \kappa'(\gamma) > 0, \tag{5.3}$$

cf. Fig. IV.1(b). Taking $\theta = \gamma$, $\tau = \tau(u)$, $G = \{\tau(u) < \infty\}$ in Theorem 4.3, we further
note that (5.1) is precisely what is needed for one of the terms in the exponent
to vanish so that Theorem 4.3 takes a particularly simple form,

$$\psi(u) = P(\tau(u) < \infty) = E_L\big[\exp\{-\gamma S_{\tau(u)}\};\ \tau(u) < \infty\big].$$

Letting $\xi(u) = S_{\tau(u)} - u$ be the overshoot and noting that $P_L(\tau(u) < \infty) = 1$
by (5.3), we can rewrite this as

$$\psi(u) = e^{-\gamma u} E_L e^{-\gamma\xi(u)}, \tag{5.4}$$

see also II.(1.5).

Theorem 5.2 (LUNDBERG'S INEQUALITY) For all $u \ge 0$, $\psi(u) \le e^{-\gamma u}$.


Proof. Just note that $\xi(u) \ge 0$ in (5.4).
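
The representation (5.4) also suggests a simulation scheme: simulate the claim surplus process under $P_L$, where ruin occurs with probability one, and average $e^{-\gamma S_{\tau(u)}}$. The following is only a minimal sketch for exponential claims, where by Example 5.1 the measure $P_L$ simply interchanges the rates $\beta$ and $\delta$; the function name, the sample size and the seed are illustrative assumptions, not part of the text.

```python
import math
import random

def simulate_psi_lundberg(u, beta, delta, n_paths=20000, seed=1):
    """Estimate psi(u) = E_L[exp(-gamma * S_tau(u))] by simulating under P_L.

    Assumes exponential claims with rate delta, Poisson rate beta < delta,
    premium rate 1 (hypothetical helper, for illustration only).
    """
    rng = random.Random(seed)
    gamma = delta - beta            # adjustment coefficient (Example 5.1)
    beta_L, delta_L = delta, beta   # Lundberg conjugation swaps the rates
    total = 0.0
    for _ in range(n_paths):
        s = 0.0                     # claim surplus S_t
        while s <= u:               # ruin can only happen at a claim epoch
            s -= rng.expovariate(beta_L)   # premium earned until next claim
            s += rng.expovariate(delta_L)  # claim size under P_L
        total += math.exp(-gamma * s)      # exp(-gamma * S_tau(u))
    return total / n_paths

u, beta, delta = 5.0, 1.0, 2.0
estimate = simulate_psi_lundberg(u, beta, delta)
exact = beta / delta * math.exp(-(delta - beta) * u)
print(estimate, exact)
```

Every sample is at most $e^{-\gamma u}$ because $S_{\tau(u)} > u$, so the estimate itself respects Lundberg's inequality.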


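Theorem 5.3 (THE CRAMÉR-LUNDBERG APPROXIMATION) $\psi(u) \sim Ce^{-\gamma u}$ as
$u \to \infty$, where

$$C = \frac{1-\rho}{\gamma\int_0^\infty x e^{\gamma x}\beta\bar B(x)\,dx} = \frac{1-\rho}{\beta\hat B'[\gamma] - 1}. \tag{5.5}$$

Proof. By renewal theory, see A.1e, $\xi(u)$ has a limit $\xi(\infty)$ (in the sense of weak
convergence w.r.t. $P_L$) with density

$$\frac{1 - G_+^{(L)}(x)}{\mu_+^{(L)}} = \frac{\bar G_+^{(L)}(x)}{\mu_+^{(L)}},$$

where $G_+^{(L)}$ is the $P_L$-ascending ladder height distribution and $\mu_+^{(L)}$ its mean.
Since $e^{-\gamma x}$ is continuous and bounded, we therefore have $E_L e^{-\gamma\xi(u)} \to C$ where

$$C = E_L e^{-\gamma\xi(\infty)} = \frac{1}{\mu_+^{(L)}}\int_0^\infty e^{-\gamma x}\big(1 - G_+^{(L)}(x)\big)\,dx
= \frac{1}{\gamma\mu_+^{(L)}}\int_0^\infty (1 - e^{-\gamma x})\,G_+^{(L)}(dx), \tag{5.6}$$

and all that is needed to check is that (5.6) is the same as (5.5). To that end,
take first $\theta = \gamma$, $\tau = \tau_+$, $G = \{S_{\tau_+} \in A\}$ in Theorem 4.3. Then

$$P(S_{\tau_+} \in A) = E_L\big[\exp\{-\gamma S_{\tau_+}\};\ S_{\tau_+} \in A\big],$$

which shows that

$$G_+^{(L)}(dx) = e^{\gamma x} G_+(dx) = e^{\gamma x}\beta\bar B(x)\,dx. \tag{5.7}$$

In principle, this solves the problem of evaluating (5.6), but some tedious (though
elementary) calculations remain to bring the expressions into a final form. Noting
that $\|G_+^{(L)}\| = 1$ because of (5.3), we get

$$\int_0^\infty (1 - e^{-\gamma x})\,G_+^{(L)}(dx) = 1 - \int_0^\infty \beta\bar B(x)\,dx = 1 - \rho.$$

Using (5.7) yields

$$\mu_+^{(L)} = \beta\int_0^\infty x e^{\gamma x}\bar B(x)\,dx = \beta\varphi'(\gamma), \tag{5.8}$$

where

$$\varphi(\alpha) = \int_0^\infty e^{\alpha x}\bar B(x)\,dx = \frac{1}{\alpha}\big(\hat B[\alpha] - 1\big), \tag{5.9}$$

so that

$$\varphi'(\gamma) = \frac{\hat B'[\gamma]}{\gamma} - \frac{\hat B[\gamma] - 1}{\gamma^2} = \frac{\hat B'[\gamma]}{\gamma} - \frac{1}{\beta\gamma}$$

(using (5.1)) and

$$\mu_+^{(L)} = \beta\varphi'(\gamma) = \frac{\beta\hat B'[\gamma] - 1}{\gamma}. \tag{5.10}$$

Inserting $1 - \rho$ and (5.10) in (5.6) gives (5.5).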


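Example 5.4 Consider first the exponential case $b(x) = \delta e^{-\delta x}$. Then $\psi(u) =
\rho e^{-(\delta-\beta)u}$ where $\rho = \beta/\delta$. From this it follows, of course, that $\gamma = \delta - \beta$ (this
was already found in Example 5.1 above) and that $C = \rho$. A direct proof of
$C = \rho$ is of course easy:

$$\hat B'[\gamma] = \frac{d}{d\gamma}\frac{\delta}{\delta - \gamma} = \frac{\delta}{(\delta - \gamma)^2} = \frac{\delta}{\beta^2},$$

$$C = \frac{1-\rho}{\beta\hat B'[\gamma] - 1} = \frac{1-\rho}{\beta\delta/\beta^2 - 1} = \frac{1-\rho}{1/\rho - 1} = \rho.$$

The accuracy of Lundberg's inequality in the exponential case thus depends on
how close $\rho$ is to one, or equivalently on how close the safety loading $\eta$ is to
zero.

Remark 5.5 Noting that

$$\rho_L - 1 = \beta_L\mu_{B_L} - 1 = \kappa_L'(0) = \kappa'(\gamma) = \beta\hat B'[\gamma] - 1,$$

we can rewrite the Cramér-Lundberg constant $C$ in the nice symmetrical form

$$C = \frac{|\kappa'(0)|}{|\kappa'(\gamma)|} = \frac{1-\rho}{\rho_L - 1}. \tag{5.11}$$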
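
Outside special cases such as Example 5.1, the Lundberg equation (5.1) has to be solved numerically. The sketch below finds $\gamma$ by bisection and then evaluates $C$ from (5.5); the function names, and the convention that the caller supplies $\hat B$, $\hat B'$ and a point `upper` with $\kappa(\text{upper}) > 0$, are assumptions made for illustration. Exponential claims are used so that the output can be compared with Examples 5.1 and 5.4.

```python
def adjustment_coefficient(beta, mgf, upper, tol=1e-12):
    """Positive root of kappa(a) = beta*(mgf(a) - 1) - a by bisection.

    `upper` must lie in the domain of `mgf` and satisfy kappa(upper) > 0
    (hypothetical interface; a positive safety loading is assumed).
    """
    kappa = lambda a: beta * (mgf(a) - 1.0) - a
    if kappa(upper) <= 0.0:
        raise ValueError("kappa(upper) must be positive")
    lo = upper / 2.0
    while kappa(lo) >= 0.0:        # kappa < 0 on (0, gamma) when eta > 0
        lo /= 2.0
        if lo < 1e-300:
            raise ValueError("no sign change found; is the loading positive?")
    hi = upper
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kappa(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def cramer_lundberg_constant(beta, mean_claim, mgf_prime, gamma):
    """C = (1 - rho) / (beta * B'[gamma] - 1), cf. (5.5)."""
    rho = beta * mean_claim
    return (1.0 - rho) / (beta * mgf_prime(gamma) - 1.0)

beta, delta = 1.0, 2.0
gamma = adjustment_coefficient(beta, lambda a: delta / (delta - a), upper=delta - 1e-6)
C = cramer_lundberg_constant(beta, 1.0 / delta,
                             lambda a: delta / (delta - a) ** 2, gamma)
print(gamma, C)   # compare with delta - beta and rho = beta / delta
```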


Remark 5.6 Let $\hat\psi[-s] = \int_0^\infty e^{-su}\psi(u)\,du$ denote the Laplace transform of the
ruin probability. The Laplace transform of $\psi(u)e^{\gamma u}$ is then $\hat\psi[-s+\gamma]$.
Since from the damping property of Laplace transforms, for any function
$f(u)$, $\lim_{u\to\infty} f(u) = \lim_{s\to 0} s\hat f[-s]$, given that this limit exists, we can also
determine the constant $C$ in the Cramér-Lundberg approximation by

$$C = \lim_{s\to 0} s\,\hat\psi[-s+\gamma],$$

which from (3.4) again gives (5.5). Although it looks tempting to use this
procedure for determining $C$ in more general models where $\gamma$ exists and only the
Laplace transform of $\psi$ may be available explicitly, it is important to note that
this procedure does not prove the Cramér-Lundberg approximation, but just
gives the correct value of $C$ in case the approximation holds (the approximation
itself usually has to be established by other techniques and often only exists in
a weaker logarithmic sense, cf. Chapter XIII).

For a related method to obtain the asymptotic behavior of $\psi(u)$ for regularly
varying claims, see Chapter X.


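In Chapter V, we shall need the following result which follows by a variant
of the calculations in the proof of Theorem 5.3:


Lemma 5.7 For $\alpha \ne \gamma$,

$$E_L e^{-\alpha\xi(\infty)} = \frac{\gamma}{\alpha\kappa'(\gamma)}\left(1 - \frac{\beta\big(\hat B[\gamma-\alpha] - 1\big)}{\gamma - \alpha}\right).$$

Proof. Replacing $\gamma$ by $\alpha$ in (5.6) and using (5.7), we obtain

$$E_L e^{-\alpha\xi(\infty)} = \frac{1}{\alpha\mu_+^{(L)}}\left(1 - \int_0^\infty e^{(\gamma-\alpha)x}\beta\bar B(x)\,dx\right)
= \frac{1}{\alpha\mu_+^{(L)}}\left(1 - \frac{\beta\big(\hat B[\gamma-\alpha] - 1\big)}{\gamma - \alpha}\right),$$

using integration by parts as in (3.5) in the last step. Inserting (5.10), the result
follows.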

Notes and references The results of this section are classical, with Lundberg’s 
inequality being given first in Lundberg [615] and the Cramér-Lundberg approximation 
in Cramér [265]. Extensions and generalizations are main topics in the area
of ruin probabilities, and numerous such results can be found later in this
book; see in particular Sections V.4, VI.3, VII.3, VII.6, XI.2, and XII.2-3.

The mathematical approach we have taken is more recent in risk theory (some of 
the classical ones can be found in the next subsection). The techniques are basically 
standard ones from sequential analysis, see for example Wald [869] and Siegmund [810]. 


5a Alternative proofs 


For the sake of completeness, we shall here give some classical proofs, first one 
of Lundberg’s inequality which is slightly longer but maybe also slightly more 
elementary: 


Alternative proof of Lundberg's inequality. Let $X$ be the value of $\{S_t\}$ just after
the first claim, $F(x) = P(X \le x)$. Then, since $X$ is the independent difference
$U - T$ between a claim $U$ and an interarrival time $T$,

$$\hat F[\gamma] = Ee^{\gamma(U-T)} = Ee^{\gamma U} \cdot Ee^{-\gamma T} = \hat B[\gamma]\,\frac{\beta}{\beta+\gamma} = 1,$$

where the last equality follows from $\kappa(\gamma) = 0$. Let $\psi^{(n)}(u)$ denote the
probability of ruin after at most $n$ claims. Conditioning upon the value $x$ of $X$ and
considering the cases $x > u$ and $x \le u$ separately yields

$$\psi^{(n+1)}(u) = \bar F(u) + \int_{-\infty}^u \psi^{(n)}(u - x)\,F(dx).$$

We claim that this implies $\psi^{(n)}(u) \le e^{-\gamma u}$, which completes the proof since
$\psi(u) = \lim_{n\to\infty}\psi^{(n)}(u)$. Indeed, this is obvious for $n = 0$ since $\psi^{(0)}(u) = 0$.
Assuming it proved for $n$, we get

$$\psi^{(n+1)}(u) \le \bar F(u) + \int_{-\infty}^u e^{-\gamma(u-x)}\,F(dx)$$
$$\le \int_u^\infty e^{-\gamma(u-x)}\,F(dx) + \int_{-\infty}^u e^{-\gamma(u-x)}\,F(dx)$$
$$= e^{-\gamma u}\hat F[\gamma] = e^{-\gamma u}.$$

Of further proofs of Lundberg's inequality, we mention in particular the
martingale approach, see III.1.

Next consider the Cramér-Lundberg approximation. Here the most standard
proof is via the renewal equation in Corollary 3.3 (however, as will be seen, the
calculations needed to identify the constant $C$ are precisely the same as above):


Alternative proof of the Cramér-Lundberg approximation. Recall from Corollary 3.3
that

$$\psi(u) = \beta\int_u^\infty \bar B(x)\,dx + \int_0^u \psi(u - x)\beta\bar B(x)\,dx.$$

Multiplying by $e^{\gamma u}$ and letting

$$Z(u) = e^{\gamma u}\psi(u), \quad z(u) = e^{\gamma u}\beta\int_u^\infty \bar B(x)\,dx, \quad F(dx) = e^{\gamma x}\beta\bar B(x)\,dx,$$

we can rewrite this as

$$Z(u) = z(u) + \int_0^u e^{\gamma(u-x)}\psi(u - x) \cdot e^{\gamma x}\beta\bar B(x)\,dx
= z(u) + \int_0^u Z(u - x)\,F(dx),$$

i.e. $Z = z + F * Z$. Note that by (5.9) and the Lundberg equation, $\gamma$ is precisely
the correct exponent which will ensure that $F$ is a proper distribution ($\|F\| = 1$).
It is then a matter of routine to verify the conditions of the key renewal theorem
(Proposition A1.1) to conclude that $Z(u)$ has the limit $C = \int_0^\infty z(x)\,dx/\mu_F$, so
that it only remains to check that $C$ reduces to the expression given above.
However, $\mu_F$ is immediately seen to be the same as $\mu_+^{(L)}$ calculated in (5.8),
whereas

$$\int_0^\infty z(u)\,du = \int_0^\infty \beta e^{\gamma u}\,du\int_u^\infty \bar B(x)\,dx = \int_0^\infty \bar B(x)\,dx\int_0^x \beta e^{\gamma u}\,du$$
$$= \int_0^\infty \bar B(x)\,\frac{\beta}{\gamma}\big(e^{\gamma x} - 1\big)\,dx = \frac{\beta}{\gamma}\left[\frac{1}{\gamma}\big(\hat B[\gamma] - 1\big) - \mu_B\right] = \frac{1-\rho}{\gamma},$$

using the Lundberg equation and the calculations in (5.9). Easy calculus now
gives (5.5).

Notes and references Another related, but slightly different proof of the
Cramér-Lundberg approximation in the spirit of Feller [362] that utilizes the Blackwell
renewal theorem can be found in Albrecher & Teugels [36].

The asymptotic behavior of the ruin probability for heavy-tailed claims will be
discussed in X.2.


6 Further topics related to the adjustment coefficient


6a On the existence of $\gamma$


In order that the adjustment coefficient $\gamma$ exists, it is of course necessary that
$B$ is light-tailed in the sense of I.2a, i.e. that $\hat B[\alpha] < \infty$ for some $\alpha > 0$. This
excludes heavy-tailed distributions like the log-normal or Pareto, but may in
many other cases not appear all that restrictive, and the following possibilities
then occur:


1. $\hat B[\alpha] < \infty$ for all $\alpha < \infty$.


2. There exists $\alpha^* < \infty$ such that $\hat B[\alpha] < \infty$ for all $\alpha < \alpha^*$ and $\hat B[\alpha] = \infty$
for all $\alpha \ge \alpha^*$.


3. There exists $\alpha^* < \infty$ such that $\hat B[\alpha] < \infty$ for all $\alpha \le \alpha^*$ and $\hat B[\alpha] = \infty$
for all $\alpha > \alpha^*$.


In particular, monotone convergence yields $\hat B[\alpha] \uparrow \infty$ as $\alpha \uparrow \infty$ in case 1, and
$\hat B[\alpha] \uparrow \infty$ as $\alpha \uparrow \alpha^*$ in case 2 (in exponential family theory, this is often referred
to as the steep case). Thus the existence of $\gamma$ is automatic in cases 1, 2; standard
examples are distributions with finite support or tail satisfying $\bar B(x) = o(e^{-\alpha x})$
for all $\alpha$ in case 1, and phase-type or Gamma distributions in case 2. Case 3
may be felt to be rather atypical, but some non-pathological examples exist, for
example the inverse Gaussian distribution (see Example 9.7 below for details). In
case 3, $\gamma$ exists provided $\hat B[\alpha^*] \ge 1 + \alpha^*/\beta$ and not otherwise, that is, dependent
on whether $\beta$ is larger or smaller than the threshold value $\alpha^*/(\hat B[\alpha^*] - 1)$.


Notes and references Ruin probabilities in case 3 with $\gamma$ non-existent are studied,
e.g., by Borovkov [182, p. 132] and Embrechts & Veraverbeke [353]. To the present
authors' mind, this is a somewhat special situation and therefore not treated in this
book.


6b Bounds and approximations for $\gamma$


Proposition 6.1 If the adjustment coefficient exists, it can be bounded by

$$\gamma < \frac{2(1 - \beta\mu_B)}{\beta\mu_B^{(2)}} = \frac{2\eta\mu_B}{\mu_B^{(2)}}.$$

Proof. From $U \ge 0$ it follows that $\hat B[\alpha] = Ee^{\alpha U} > 1 + \mu_B\alpha + \mu_B^{(2)}\alpha^2/2$. Hence

$$1 = \frac{\beta(\hat B[\gamma] - 1)}{\gamma} > \frac{\beta\big(\gamma\mu_B + \gamma^2\mu_B^{(2)}/2\big)}{\gamma} = \beta\mu_B + \frac{\beta\gamma\mu_B^{(2)}}{2}, \tag{6.1}$$

from which the result immediately follows.


The upper bound in Proposition 6.1 is also an approximation for small safety
loadings (heavy traffic, cf. Section 7c):


Proposition 6.2 Let $B$ be fixed but assume that $\beta = \beta(\eta)$ varies with the safety
loading such that $\beta = \dfrac{1}{\mu_B(1+\eta)}$. Then as $\eta \downarrow 0$,

$$\gamma = \gamma(\eta) \sim \frac{2\eta\mu_B}{\mu_B^{(2)}}. \tag{6.2}$$

Further, the Cramér-Lundberg constant satisfies $C = C(\eta) \sim 1$.


Proof. Since $\psi(u) \to 1$ as $\eta \downarrow 0$, it follows from Lundberg's inequality that $\gamma \to 0$.
Hence by Taylor expansion, the inequality in (6.1) is also an approximation so
that

$$1 = \frac{\beta(\hat B[\gamma] - 1)}{\gamma} \approx \frac{\beta\big(\gamma\mu_B + \gamma^2\mu_B^{(2)}/2\big)}{\gamma} = \rho + \frac{\beta\gamma\mu_B^{(2)}}{2},$$

$$\gamma \approx \frac{2(1-\rho)}{\beta\mu_B^{(2)}} = \frac{2\eta\mu_B}{\mu_B^{(2)}}.$$

That $C \to 1$ easily follows from $\gamma \to 0$ and $C = E_L e^{-\gamma\xi(\infty)}$ (in the limit, $\xi(\infty)$
is distributed as the overshoot corresponding to $\eta = 0$). For an alternative
analytic proof, note that

$$C = \frac{1-\rho}{\beta\hat B'[\gamma] - 1} = \frac{\eta\mu_B}{\hat B'[\gamma] - 1/\beta}
\approx \frac{\eta\mu_B}{\mu_B + \gamma\mu_B^{(2)} - \mu_B(1+\eta)} = \frac{\eta}{\gamma\mu_B^{(2)}/\mu_B - \eta} \approx \frac{\eta}{2\eta - \eta} = 1.$$

Obviously, the approximation (6.2) is easier to calculate than $\gamma$ itself. However,
it needs to be used with caution, say in Lundberg's inequality or the
Cramér-Lundberg approximation, in particular when $u$ is large.
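
To get a feeling for how quickly the bound of Proposition 6.1 turns into the approximation (6.2), it can be compared with the exact value $\gamma = \delta - \beta$ for exponential claims. This is an illustrative sketch only; the helper name and the chosen parameter values are assumptions.

```python
def gamma_upper_bound(beta, m1, m2):
    """Right-hand side of Proposition 6.1: 2*(1 - beta*m1) / (beta*m2),
    with m1, m2 the first two moments of the claim size distribution."""
    return 2.0 * (1.0 - beta * m1) / (beta * m2)

delta = 2.0                          # exponential claims with rate delta
m1, m2 = 1.0 / delta, 2.0 / delta ** 2
rows = []
for beta in (1.0, 1.8, 1.98):        # safety loading shrinks as beta -> delta
    exact = delta - beta             # adjustment coefficient, Example 5.1
    bound = gamma_upper_bound(beta, m1, m2)
    rows.append((beta, exact, bound, bound / exact))
for row in rows:
    print(row)
```

For exponential claims the ratio bound/exact equals $\delta/\beta = 1 + \eta$, so the relative error of (6.2) is exactly the safety loading in this case.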


6c A refinement of Lundberg’s inequality 


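The following result gives a sharpening of Lundberg's inequality (because obviously
$C_+ \le 1$) as well as a supplementary lower bound:


Theorem 6.3 $C_- e^{-\gamma u} \le \psi(u) \le C_+ e^{-\gamma u}$ where

$$C_- = \inf_{x \ge 0}\frac{\bar B(x)}{\int_x^\infty e^{\gamma(y-x)}\,B(dy)}, \qquad
C_+ = \sup_{x \ge 0}\frac{\bar B(x)}{\int_x^\infty e^{\gamma(y-x)}\,B(dy)}.$$

Proof. Let $H(dt, dx)$ be the $P_L$-distribution of the time $\tau(u)$ of ruin and the
reserve $u - S_{\tau(u)-}$ just before ruin. Given $\tau(u) = t$, $u - S_{\tau(u)-} = x$, a claim
occurs at time $t$ and has distribution $B_L(dy)/\bar B_L(x)$, $y > x$. Hence

$$E_L e^{-\gamma\xi(u)} = \int_0^\infty\!\!\int_0^\infty H(dt, dx)\int_x^\infty e^{-\gamma(y-x)}\,\frac{B_L(dy)}{\bar B_L(x)}$$
$$= \int_0^\infty\!\!\int_0^\infty H(dt, dx)\,\frac{\int_x^\infty e^{-\gamma(y-x)}e^{\gamma y}\,B(dy)/\hat B[\gamma]}{\int_x^\infty e^{\gamma y}\,B(dy)/\hat B[\gamma]}$$
$$= \int_0^\infty\!\!\int_0^\infty H(dt, dx)\,\frac{\bar B(x)}{\int_x^\infty e^{\gamma(y-x)}\,B(dy)}$$
$$\le C_+\int_0^\infty\!\!\int_0^\infty H(dt, dx) = C_+.$$

The upper bound then follows from $\psi(u) = e^{-\gamma u}E_L e^{-\gamma\xi(u)}$, and the proof of the
lower bound is similar.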


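Example 6.4 If $\bar B(x) = e^{-\delta x}$, then an explicit calculation shows easily that

$$\frac{\bar B(x)}{\int_x^\infty e^{\gamma(y-x)}\,B(dy)} = \frac{e^{-\delta x}}{\int_x^\infty e^{(\delta-\beta)(y-x)}\delta e^{-\delta y}\,dy} = \frac{\beta}{\delta} = \rho.$$

Hence $C_- = C_+ = \rho$ so that the bounds in Theorem 6.3 collapse and yield the
exact expression $\rho e^{-\gamma u}$ for $\psi(u)$.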


The following concluding example illustrates a variety of the topics discussed
above (though from a general point of view the calculations are deceptively
simple: typically, $\gamma$ and other quantities will have to be calculated numerically).

Example 6.5 Assume as for (3.1) that $\beta = 3$ and

$$b(x) = \frac12 \cdot 3e^{-3x} + \frac12 \cdot 7e^{-7x},$$

and recall that the ruin probability is

$$\psi(u) = \frac{24}{35}e^{-u} + \frac{1}{35}e^{-6u}.$$

Since the dominant term is $24/35 \cdot e^{-u}$, it follows immediately that $\gamma = 1$
and $C = 24/35 = 0.686$ (also, bounding $e^{-6u}$ by $e^{-u}$ confirms Lundberg's
inequality). For a direct verification, note that the Lundberg equation is

$$\gamma = \beta(\hat B[\gamma] - 1) = 3\left(\frac12 \cdot \frac{3}{3-\gamma} + \frac12 \cdot \frac{7}{7-\gamma} - 1\right),$$

which after some elementary algebra leads to the cubic equation $2\gamma^3 - 14\gamma^2 +
12\gamma = 0$ with roots 0, 1, 6. Thus indeed $\gamma = 1$ (6 is not in the domain of
convergence of $\hat B[\gamma]$ and therefore excluded). Further,

$$1 - \rho = 1 - \beta\mu_B = 1 - 3\left(\frac12 \cdot \frac13 + \frac12 \cdot \frac17\right) = \frac27,$$

$$\hat B'[\gamma] = \left.\left(\frac12 \cdot \frac{3}{(3-\alpha)^2} + \frac12 \cdot \frac{7}{(7-\alpha)^2}\right)\right|_{\alpha=1} = \frac{17}{36},$$

$$C = \frac{1-\rho}{\beta\hat B'[\gamma] - 1} = \frac{2/7}{3 \cdot \frac{17}{36} - 1} = \frac{24}{35}.$$

For Theorem 6.3, note that the function

$$\frac{\int_u^\infty\big\{\frac12 \cdot 3e^{-3x} + \frac12 \cdot 7e^{-7x}\big\}\,dx}{\int_u^\infty e^{x-u}\big\{\frac12 \cdot 3e^{-3x} + \frac12 \cdot 7e^{-7x}\big\}\,dx}
= \frac{3 + 3e^{-4u}}{\frac92 + \frac72 e^{-4u}}$$

attains its minimum $C_- = 2/3 = 0.667$ for $u = \infty$ and its maximum $C_+ =
3/4 = 0.750$ for $u = 0$, so that $0.667 \le C \le 0.750$ in accordance with $C = 0.686$.
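
The constants $C_\pm$ of Theorem 6.3 for Example 6.5 can also be checked on a grid. The sketch below evaluates the ratio $\bar B(x)/\int_x^\infty e^{\gamma(y-x)}B(dy)$ in closed form for the two-term hyperexponential density with $\gamma = 1$; the grid and the function name are illustrative choices, and the infimum (approached as $x \to \infty$) is only approximated by the last grid point.

```python
import math

def ratio(x, gamma=1.0):
    """B-bar(x) / int_x^inf e^{gamma(y-x)} B(dy) for
    b(x) = 0.5*3*exp(-3x) + 0.5*7*exp(-7x) (Example 6.5)."""
    tail = 0.5 * math.exp(-3 * x) + 0.5 * math.exp(-7 * x)
    # int_x^inf e^{gamma(y-x)} r e^{-r y} dy = r/(r-gamma) * e^{-r x}, r > gamma
    denom = (0.5 * 3 / (3 - gamma) * math.exp(-3 * x)
             + 0.5 * 7 / (7 - gamma) * math.exp(-7 * x))
    return tail / denom

grid = [0.01 * k for k in range(2001)]     # x in [0, 20]
values = [ratio(x) for x in grid]
c_minus, c_plus = min(values), max(values)
print(c_minus, c_plus, 24 / 35)
```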


Notes and references Theorem 6.3 is from Taylor [836]. Closely related results
are given in a queueing setting in Kingman [535], Ross [748] and Rossberg & Siegel
[749].

Some further references on variants and extensions of Lundberg's inequality are
Kaas & Goovaerts [514], Willmot [886], Cai & Garrido [218], Dickson [306], Kalashnikov
[516, 518] and Chadjiconstantinidis & Politis [227], all of which also go into
aspects of the heavy-tailed case.


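7 Various approximations for the ruin probability


7a The Beekman-Bowers approximation


The idea is to write $\psi(u)$ as $P(M > u)$, fit a gamma distribution with parameters
$\lambda$, $\delta$ to the distribution of $M$ by matching the two first moments and use the
incomplete gamma function approximation

$$\psi(u) \approx \int_u^\infty \frac{\delta^\lambda}{\Gamma(\lambda)}x^{\lambda-1}e^{-\delta x}\,dx.$$

According to Corollary 3.5, this means that $\lambda$, $\delta$ are given by $\lambda/\delta = a_1$, $2\lambda/\delta^2 =
a_2$ where

$$a_1 = \frac{\rho\mu_B^{(2)}}{2(1-\rho)\mu_B}, \qquad a_2 = \frac{\rho\mu_B^{(3)}}{3(1-\rho)\mu_B} + \frac{\rho^2\big(\mu_B^{(2)}\big)^2}{2(1-\rho)^2\mu_B^2},$$

i.e. $\delta = 2a_1/a_2$, $\lambda = 2a_1^2/a_2$.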


Notes and references The approximation was introduced by Beekman [151], with 
the present version suggested by Bowers in the discussion of [151]. 


7b De Vylder’s approximation 


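Given a risk process with parameters $\beta$, $B$, $p = 1$, the idea is to approximate
the ruin probability with the one for a different process with exponential claims,
say with rate parameter $\tilde\delta$, arrival intensity $\tilde\beta$ and premium rate $\tilde p$. In order
to make the processes look as much as possible alike, we make the first three
cumulants match, which according to Proposition 1.1 means

$$\frac{\tilde\beta}{\tilde\delta} - \tilde p = \beta\mu_B - 1 = \rho - 1, \qquad \frac{2\tilde\beta}{\tilde\delta^2} = \beta\mu_B^{(2)}, \qquad \frac{6\tilde\beta}{\tilde\delta^3} = \beta\mu_B^{(3)}.$$

These three equations have solutions

$$\tilde\delta = \frac{3\mu_B^{(2)}}{\mu_B^{(3)}}, \qquad \tilde\beta = \frac{9\beta\big(\mu_B^{(2)}\big)^3}{2\big(\mu_B^{(3)}\big)^2}, \qquad \tilde p = \frac{3\beta\big(\mu_B^{(2)}\big)^2}{2\mu_B^{(3)}} - \rho + 1. \tag{7.1}$$

Letting $\beta^* = \tilde\beta/\tilde p$, $\rho^* = \beta^*/\tilde\delta$, the approximating risk process has ruin
probability $\tilde\psi(u) = \rho^* e^{-(\tilde\delta - \beta^*)u}$, cf. Proposition I.1.3 and Corollary 3.2, and hence the
ruin probability approximation is

$$\psi(u) \approx \rho^* e^{-(\tilde\delta - \beta^*)u}. \tag{7.2}$$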
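
The recipe (7.1)-(7.2) only needs the first three claim size moments, so it translates directly into a few lines of code. The following is a minimal sketch with hypothetical function names; it is checked on exponential claims, for which the approximation must reproduce the exact ruin probability $\rho e^{-(\delta-\beta)u}$.

```python
import math

def de_vylder(beta, m1, m2, m3):
    """Return (rho_star, rate) such that psi(u) ~ rho_star * exp(-rate * u),
    following (7.1)-(7.2); m1, m2, m3 are the first three claim moments."""
    rho = beta * m1
    delta_t = 3.0 * m2 / m3                              # tilde-delta
    beta_t = 9.0 * beta * m2 ** 3 / (2.0 * m3 ** 2)      # tilde-beta
    p_t = 3.0 * beta * m2 ** 2 / (2.0 * m3) - rho + 1.0  # tilde-p
    beta_star = beta_t / p_t
    return beta_star / delta_t, delta_t - beta_star

def psi_de_vylder(u, beta, m1, m2, m3):
    rho_star, rate = de_vylder(beta, m1, m2, m3)
    return rho_star * math.exp(-rate * u)

# Exponential claims with rate 2 and beta = 1: moments k!/2^k.
beta, delta = 1.0, 2.0
moments = (1 / delta, 2 / delta ** 2, 6 / delta ** 3)
rho_star, rate = de_vylder(beta, *moments)
print(rho_star, rate)          # should agree with rho = 0.5 and delta - beta = 1
```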


Notes and references The approximation (7.2) was suggested by De Vylder [299]. 
Though of course it is based upon purely empirical grounds, numerical evidence (e.g. 
Grandell [429, pp. 19-24], [432]) shows that it may produce surprisingly good results, 
in particular for light-tailed claim distributions. Extensions of this method to approx- 
imations with more general involved claim distributions are immediate, but there is 
a natural trade-off between complexity and accuracy of the approximation. For an 
investigation on the use of Coxian distributions of order two for the claim distribution 
of the approximating risk process, see Badescu & Stanford [120]. Due to its simplicity, 
the De Vylder approximation is also very popular for the study of effects of external 
mechanisms such as dividend payments and reinsurance on the probability of ruin (see 
for instance Beveridge, Dickson & Wu [161], Gerber, Shiu & Smith [413]). 

A related procedure is to approximate $\psi(u)$ by a combination of two exponential
terms, where one of them is the Cramér-Lundberg approximation (5.5) and the
coefficient and exponent of the other are determined by matching $E[M]$ and the mass of
$M$ in 0. This leads to the so-called Tijms approximation, see [852] and Lin & Willmot
[892, Ch.8].


7c The heavy traffic approximation


The term heavy traffic comes from queueing theory, but has an obvious
interpretation also in risk theory: on the average, the premiums exceed only slightly
the expected claims. That is, heavy traffic conditions mean that the safety loading
$\eta$ is positive but small, or equivalently that $\beta$ is only slightly smaller than
$\beta_{\max} = 1/\mu_B$. Mathematically, we shall represent this situation with a limit
where $\beta \uparrow \beta_{\max}$ but $B$ is fixed.


Proposition 7.1 As $\beta \uparrow \beta_{\max}$, $(\beta_{\max} - \beta)M$ converges in distribution to the
exponential distribution with rate $\delta = 2\mu_B^2/\mu_B^{(2)}$.

Proof. Note first that $1 - \rho = (\beta_{\max} - \beta)\mu_B$. Letting $B_0$ be the stationary excess
life distribution, we have according to the Pollaczeck-Khinchine formula in the
form (3.6) that

$$Ee^{s(\beta_{\max}-\beta)M} = \frac{1-\rho}{1 - \rho\hat B_0[s(\beta_{\max}-\beta)]} = \frac{1-\rho}{1 - \rho + \rho\{1 - \hat B_0[s(\beta_{\max}-\beta)]\}}$$
$$\approx \frac{1-\rho}{1 - \rho - \rho s(\beta_{\max}-\beta)\mu_{B_0}} \approx \frac{1-\rho}{1 - \rho - s(\beta_{\max}-\beta)\mu_{B_0}}
= \frac{\mu_B}{\mu_B - s\mu_{B_0}} = \frac{\delta}{\delta - s},$$

where $\delta = \mu_B/\mu_{B_0} = 2\mu_B^2/\mu_B^{(2)}$.


Corollary 7.2 If $\beta \uparrow \beta_{\max}$, $u \to \infty$ in such a way that $(\beta_{\max} - \beta)u \to v$, then
$\psi(u) \to e^{-\delta v}$.


Proof. Write $\psi(u)$ as $P\big((\beta_{\max} - \beta)M > (\beta_{\max} - \beta)u\big)$.

These results suggest the approximation

$$\psi(u) \approx e^{-\delta(\beta_{\max}-\beta)u}. \tag{7.3}$$

It is worth noting that this is essentially the same as the approximation

$$\psi(u) \approx Ce^{-\gamma u} \approx e^{-2u\eta\mu_B/\mu_B^{(2)}} \tag{7.4}$$

suggested by the Cramér-Lundberg approximation and Proposition 6.2. This
follows since $\eta = 1/\rho - 1 \approx 1 - \rho$, and hence

$$\delta(\beta_{\max} - \beta) = \frac{2\mu_B^2}{\mu_B^{(2)}} \cdot \frac{1-\rho}{\mu_B} \approx \frac{2\eta\mu_B}{\mu_B^{(2)}}.$$

However, obviously Corollary 7.2 provides the better mathematical foundation.
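
As a quick plausibility check of (7.3), one can compare it with the exact ruin probability for exponential claims while letting $\beta \uparrow \beta_{\max}$ with $(\beta_{\max} - \beta)u$ held fixed, as in Corollary 7.2. This is a sketch only; the function names and parameter values are assumptions for illustration.

```python
import math

def psi_heavy_traffic(u, beta, m1, m2):
    """(7.3): exp(-delta*(beta_max - beta)*u) with delta = 2*m1^2/m2
    and beta_max = 1/m1 (m1, m2: first two claim size moments)."""
    delta = 2.0 * m1 ** 2 / m2
    beta_max = 1.0 / m1
    return math.exp(-delta * (beta_max - beta) * u)

def psi_exponential(u, beta, rate):
    """Exact ruin probability for exponential claims with the given rate."""
    return beta / rate * math.exp(-(rate - beta) * u)

rate = 2.0
m1, m2 = 1.0 / rate, 2.0 / rate ** 2
ratios = []
for beta in (1.0, 1.8, 1.98):
    u = 1.0 / (rate - beta)              # keeps (beta_max - beta) * u = 1
    ratios.append(psi_exponential(u, beta, rate) / psi_heavy_traffic(u, beta, m1, m2))
print(ratios)                            # exact / approximation, tends to 1
```

For exponential claims the ratio is exactly $\rho$, which makes the role of the neglected prefactor explicit.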


Notes and references Heavy traffic limit theory for queues goes back to Kingman
[534]. The present situation of Poisson arrivals is somewhat more elementary to deal
with than the renewal case (see e.g. [APQ, X.7]). We return to heavy traffic from
a different point of view (diffusion approximations) in Chapter V and give further
references there. In the setting of risk theory, the first results of heavy traffic type
seem to be due to Hadwiger [445].

Numerical evidence shows that the fit of (7.3) is reasonable for $\eta$ being say 10-20%
and $u$ being small or moderate, while the approximation may be far off for large $u$.


7d The light traffic approximation


As for heavy traffic, the term light traffic comes from queueing theory, but has
an obvious interpretation also in risk theory: on the average, the premiums are
much larger than the expected claims. That is, light traffic conditions mean
that the safety loading $\eta$ is positive and large, or equivalently that $\beta$ is small
compared to $\mu_B$. Mathematically, we shall represent this situation with a limit
where $\beta \downarrow 0$ but $B$ is fixed.

Of course, in risk theory heavy traffic is most often argued to be the more
typical case. However, light traffic is of some interest as a complement to heavy
traffic, and it is also needed for the interpolation approximation to be studied
in the next subsection.


Proposition 7.3 As $\beta \downarrow 0$,

$$\psi(u) \approx \beta\int_u^\infty \bar B(x)\,dx = \beta E[U - u;\ U > u] = \beta E(U - u)^+. \tag{7.5}$$

Proof. According to the Pollaczeck-Khinchine formula,

$$\psi(u) = (1-\rho)\sum_{n=1}^\infty \beta^n\mu_B^n\bar B_0^{*n}(u) \approx \sum_{n=1}^\infty \beta^n\mu_B^n\bar B_0^{*n}(u).$$

Asymptotically, $\sum_{n=2}^\infty \cdots = O(\beta^2)$ so that only the first term matters, and
hence

$$\psi(u) \approx \beta\mu_B\bar B_0(u) = \beta\int_u^\infty \bar B(x)\,dx.$$

The alternative expressions in (7.5) follow by integration by parts.


Note that heuristically the light traffic approximation in Proposition 7.3 is
the same which comes out by saying that basically ruin can only occur at the
time $T$ of the first claim, i.e. $\psi(u) \approx P(U - T > u)$. Indeed, by monotone
convergence

$$P(U - T > u) = \int_0^\infty \bar B(x + u)\beta e^{-\beta x}\,dx \approx \beta\int_u^\infty \bar B(x)\,dx.$$
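
The light traffic formula (7.5) is a one-dimensional integral of the claim size tail and is easy to evaluate numerically. The sketch below uses a plain midpoint rule on a truncated range; the function name, the truncation point and the number of nodes are illustrative assumptions. For exponential claims the ratio to the exact ruin probability is $e^{-\beta u}$, which tends to 1 as $\beta \downarrow 0$.

```python
import math

def psi_light_traffic(u, beta, tail, upper, n=100000):
    """(7.5): beta * int_u^upper tail(x) dx by the midpoint rule.
    `tail` is x -> B-bar(x); `upper` truncates the infinite range."""
    h = (upper - u) / n
    return beta * h * sum(tail(u + (k + 0.5) * h) for k in range(n))

rate, beta, u = 2.0, 0.01, 1.0
lt = psi_light_traffic(u, beta, lambda x: math.exp(-rate * x), upper=u + 40.0)
exact = beta / rate * math.exp(-(rate - beta) * u)   # exponential claims
print(lt, exact, lt / exact)
```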


Notes and references Light traffic limit theory for queues was initiated by Bloom- 
field & Cox [178]. For a more comprehensive treatment, see Daley & Rolski [270, 271], 
Asmussen [61] and references therein. Again, the Poisson case is much easier than the 
renewal case. Another way to understand that the present analysis is much simpler 
than in these references is the fact that in the queueing setting light traffic theory 
is much easier for virtual waiting times (the probability of the conditioning event 
{M > 0} is explicit) than for actual waiting times, cf. Sigman [811]. Light traffic was 
first studied in risk theory in the first edition of this book. 




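7e Interpolating between light and heavy traffic


We shall now outline an idea of how the heavy and light traffic approximations
can be combined. The crude idea of interpolating between light and heavy traffic
leads to

$$\psi(u) \approx \left(1 - \frac{\beta}{\beta_{\max}}\right)\lim_{\beta\downarrow 0}\psi(u) + \frac{\beta}{\beta_{\max}}\lim_{\beta\uparrow\beta_{\max}}\psi(u)
= \left(1 - \frac{\beta}{\beta_{\max}}\right) \cdot 0 + \frac{\beta}{\beta_{\max}} \cdot 1 = \frac{\beta}{\beta_{\max}} = \rho,$$

which is clearly useless. Instead, to get non-degenerate limits, we combine with
our explicit knowledge of $\psi(u)$ for the exponential claim size distribution $E$ with
the same mean $\mu_B$ as the given one $B$, that is, with rate $1/\mu_B = \beta_{\max}$. Let
$\psi_{LT}^{(B)}(u)$ denote the light traffic approximation given by Proposition 7.3 and use
similar notation for $\psi^{(B)}(u) = \psi(u)$, $\psi^{(E)}(u) = \rho e^{-(\beta_{\max}-\beta)u}$, $\psi_{HT}^{(B)}(u)$, $\psi_{LT}^{(E)}(u)$,
$\psi_{HT}^{(E)}(u)$. Substituting $v = u(\beta_{\max} - \beta)$, we see that the following limits exist:

$$\lim_{\beta\uparrow\beta_{\max}}\frac{\psi_{HT}^{(B)}\big(\frac{v}{\beta_{\max}-\beta}\big)}{\psi^{(E)}\big(\frac{v}{\beta_{\max}-\beta}\big)} = \frac{e^{-\delta v}}{e^{-v}} = e^{(1-\delta)v} = c_{HT}(v) \quad \text{(say)},$$

$$\lim_{\beta\downarrow 0}\frac{\psi_{LT}^{(B)}\big(\frac{v}{\beta_{\max}-\beta}\big)}{\psi^{(E)}\big(\frac{v}{\beta_{\max}-\beta}\big)}
= \lim_{\beta\downarrow 0}\frac{\psi_{LT}^{(B)}\big(\frac{v}{\beta_{\max}-\beta}\big)}{\psi_{LT}^{(E)}\big(\frac{v}{\beta_{\max}-\beta}\big)}
= \lim_{\beta\downarrow 0}\frac{\beta\int_{v/(\beta_{\max}-\beta)}^\infty \bar B(x)\,dx}{\beta\int_{v/(\beta_{\max}-\beta)}^\infty e^{-\beta_{\max}x}\,dx}$$
$$= \beta_{\max}e^{v}\int_{v/\beta_{\max}}^\infty \bar B(x)\,dx = c_{LT}(v) \quad \text{(say)},$$

and the approximation we suggest is

$$\psi(u) \approx \psi^{(E)}(u)\left[\left(1 - \frac{\beta}{\beta_{\max}}\right)c_{LT}\big(u(\beta_{\max}-\beta)\big) + \frac{\beta}{\beta_{\max}}c_{HT}\big(u(\beta_{\max}-\beta)\big)\right]$$
$$= \rho(1-\rho)\beta_{\max}\int_{u(1-\rho)}^\infty \bar B(x)\,dx + \rho^2 e^{-\delta(\beta_{\max}-\beta)u}. \tag{7.6}$$

The particular features of this approximation are that it is exact for the
exponential distribution and asymptotically correct both in light and heavy traffic.
Thus, even if the safety loading is not very small, one may hope that some
correction of the heavy traffic approximation has been obtained.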
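
The closed form (7.6) is straightforward to code once the integrated tail $\int_x^\infty \bar B(y)\,dy$ is available. The following sketch (hypothetical function name; the caller supplies the integrated tail) checks the stated exactness for exponential claims.

```python
import math

def psi_interpolated(u, beta, m1, m2, integrated_tail):
    """(7.6); `integrated_tail(x)` must return int_x^inf B-bar(y) dy,
    and m1, m2 are the first two claim size moments."""
    rho = beta * m1
    beta_max = 1.0 / m1
    delta = 2.0 * m1 ** 2 / m2
    light = rho * (1.0 - rho) * beta_max * integrated_tail(u * (1.0 - rho))
    heavy = rho ** 2 * math.exp(-delta * (beta_max - beta) * u)
    return light + heavy

# Exponential claims with rate 2: integrated tail is exp(-2x)/2.
rate, beta, u = 2.0, 1.0, 3.0
approx = psi_interpolated(u, beta, 1 / rate, 2 / rate ** 2,
                          lambda x: math.exp(-rate * x) / rate)
exact = beta / rate * math.exp(-(rate - beta) * u)
print(approx, exact)
```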




Notes and references In the queueing setting, the idea of interpolating between
light and heavy traffic is due to Burman & Smith [209, 210]. Another main queueing
paper is Whitt [882], where further references can be found. The adaptation to risk
theory is new; no empirical study of the fit of (7.6) is, however, available.


8 Comparing the risks of different claim size distributions


Given two claim size distributions $B^{(1)}$, $B^{(2)}$, we may ask which one carries the
larger risk in the sense of larger values of the ruin probability $\psi^{(i)}(u)$ for a fixed
value of $\beta$.

To this end, we shall need various ordering properties of distributions, for
more detail and background on which we refer to Müller & Stoyan [653] or
Shaked & Shanthikumar [795].

Recall that $B^{(1)}$ is said to be stochastically smaller than $B^{(2)}$ (in symbols,
$B^{(1)} \prec_{st} B^{(2)}$) if $\bar B^{(1)}(x) \le \bar B^{(2)}(x)$ for all $x$; equivalent characterizations are
$\int f\,dB^{(1)} \le \int f\,dB^{(2)}$ for any non-decreasing function $f$, or the existence of random
variables $U^{(1)}$, $U^{(2)}$ such that $U^{(1)}$ has distribution $B^{(1)}$, $U^{(2)}$ distribution
$B^{(2)}$ and $U^{(1)} \le U^{(2)}$ a.s.

A weaker concept is increasing convex ordering: $B^{(1)}$ is said to be smaller
than $B^{(2)}$ in the increasing convex order (in symbols, $B^{(1)} \prec_{icx} B^{(2)}$) if

$$\int_x^\infty \bar B^{(1)}(y)\,dy \le \int_x^\infty \bar B^{(2)}(y)\,dy \tag{8.1}$$

for all $x$; an equivalent characterization is $\int f\,dB^{(1)} \le \int f\,dB^{(2)}$ for any
non-decreasing convex function $f$. In the literature on risk theory, most often the
term stop-loss ordering is used instead of increasing convex ordering because
for a given distribution $B$, one can interpret $\int_x^\infty \bar B(y)\,dy$ as the net stop-loss
premium in a stop-loss or excess-of-loss reinsurance arrangement with retention
limit $x$, cf. XVI.4.

Finally, we have the convex ordering: $B^{(1)}$ is said to be convexly smaller
than $B^{(2)}$ (in symbols, $B^{(1)} \prec_{cx} B^{(2)}$) if $\int f\,dB^{(1)} \le \int f\,dB^{(2)}$ for any convex
function $f$. Rather than measuring difference in size, this ordering measures
difference in variability. In particular (consider the convex functions $x$ and $-x$)
the definition implies that $B^{(1)}$ and $B^{(2)}$ must have the same mean, whereas
(consider $x^2$) $B^{(2)}$ has the larger variance. One can show that if $B^{(1)}$ and $B^{(2)}$
have the same mean and $B^{(1)} \prec_{icx} B^{(2)}$, this is equivalent to $B^{(1)} \prec_{cx} B^{(2)}$.


Proposition 8.1 If $B^{(1)} \prec_{st} B^{(2)}$, then $\psi^{(1)}(u) \le \psi^{(2)}(u)$ for all $u$.


Proof. According to the above characterization of stochastic ordering, we can
assume that $S_t^{(1)} \le S_t^{(2)}$ for all $t$. In terms of the time to ruin, this implies
$\tau^{(1)}(u) \ge \tau^{(2)}(u)$ for all $u$ so that $\{\tau^{(1)}(u) < \infty\} \subseteq \{\tau^{(2)}(u) < \infty\}$. Taking
probabilities, the proof is complete.

Of course, Proposition 8.1 is quite weak, and a particular deficit is that we
cannot compare the risks of claim size distributions with the same mean: if
$B^{(1)} \prec_{st} B^{(2)}$ and $\mu_{B^{(1)}} = \mu_{B^{(2)}}$, then $B^{(1)} = B^{(2)}$. Here convex ordering is
useful:


Proposition 8.2 If $B^{(1)} \prec_{icx} B^{(2)}$ and $\mu_{B^{(1)}} = \mu_{B^{(2)}}$ (i.e. $B^{(1)} \prec_{cx} B^{(2)}$),
then $\psi^{(1)}(u) \le \psi^{(2)}(u)$ for all $u$.


Proof. Since the means are equal, say to $\mu$, we have

$$\bar B_0^{(1)}(x) = \frac{1}{\mu}\int_x^\infty \bar B^{(1)}(y)\,dy \le \frac{1}{\mu}\int_x^\infty \bar B^{(2)}(y)\,dy = \bar B_0^{(2)}(x). \tag{8.2}$$

I.e., $B_0^{(1)} \prec_{st} B_0^{(2)}$ which implies the same order relation for all convolution
powers. Hence by the Pollaczeck-Khinchine formula

$$\psi^{(1)}(u) = (1-\rho)\sum_{n=1}^\infty \beta^n\mu^n\bar B_0^{(1)*n}(u) \le (1-\rho)\sum_{n=1}^\infty \beta^n\mu^n\bar B_0^{(2)*n}(u) = \psi^{(2)}(u).$$


Remark 8.3 From the proof above it is clear that $\psi^{(1)}(u) \le \psi^{(2)}(u)$ for all
$u$ still holds if the assumption on the ordering of the claim size distribution is
weakened to just ask for (8.2). Slightly more general, the ordering defined by

$$\frac{1}{\mu_{B^{(1)}}}\int_x^\infty \bar B^{(1)}(y)\,dy \le \frac{1}{\mu_{B^{(2)}}}\int_x^\infty \bar B^{(2)}(y)\,dy \quad \text{for all } x \ge 0,$$

is known as the harmonic mean residual life order and is sufficient for $\psi^{(1)}(u) \le
\psi^{(2)}(u)$ to hold as long as $\beta\mu_{B^{(1)}} \le \beta\mu_{B^{(2)}}$.


A general picture that emerges from these results and numerical studies like
in Example 8.6 below is that (in a rough formulation) increased variation in $B$
increases the risk (assuming that we fix the mean). The problem is to specify
what 'variation' means. A first attempt would of course be to identify 'variation'
with variance. The heavy traffic approximation (7.4) certainly supports this
view: noting that, with fixed mean, larger variance is tantamount to larger second
moment, it is seen that asymptotically in heavy traffic larger claim size variance
leads to larger ruin probabilities. Proposition 8.2 provides another instance of
this, and here is one more result of the same flavor:


Corollary 8.4 Let $D$ refer to the distribution degenerate at $\mu_B$. Then $\psi^{(D)}(u)
\le \psi^{(B)}(u)$ for all $u$.

Proof. If $f$ is convex, we have by Jensen's inequality that $Ef(U) \ge f(EU)$. This
implies that $D \prec_{cx} B$ and we can apply Proposition 8.2.


A partial converse to Proposition 8.2 is the following:


Proposition 8.5 If $\psi^{(1)}(u) \le \psi^{(2)}(u)$ for all $u$ and $\beta$, then $B^{(1)} \prec_{icx} B^{(2)}$.


Proof. Consider the light traffic approximation in Proposition 7.3.


We finally give a numerical example illustrating how differences in the claim
size distribution $B$ may lead to very different ruin probabilities even if we fix
the mean $\mu = \mu_B$.


Example 8.6 Fix $\beta$ at 1/1.1 and $\mu_B$ at 1 so that the safety loading $\eta$ is 10%,
and consider the following claim size distributions:


$B_1$: the standard exponential distribution with density $e^{-x}$;


$B_2$: the hyperexponential distribution with density $0.1\lambda_1 e^{-\lambda_1 x} + 0.9\lambda_2 e^{-\lambda_2 x}$
where $\lambda_1 = 0.1358$, $\lambda_2 = 3.4142$;


$B_3$: the Erlang distribution with density $4xe^{-2x}$;


$B_4$: the Pareto distribution with density $3/(1 + 2x)^{5/2}$.


Let $u_\alpha$ denote the $\alpha$ fractile of the ruin function, i.e. $\psi(u_\alpha) = \alpha$, and consider
$\alpha$ = 5%, 1%, 0.1%, 0.01%. One then obtains the following table:


                $B_1$   $B_2$   $B_3$   $B_4$
$u_{0.05}$         32     181      24      35
$u_{0.01}$         50     282      37      70
$u_{0.001}$        75     425      56     245
$u_{0.0001}$      100     568      74    1100


(the table was produced using simulation and the numbers are therefore subject
to statistical uncertainty). Note that to make the figures comparable, all distributions
have mean 1. In terms of variances $\sigma_i^2$, we have

$$\sigma_3^2 = \tfrac12 < \sigma_1^2 = 1 < \sigma_2^2 = 10 < \sigma_4^2 = \infty,$$

so that in this sense $B_4$ is the most variable. However, in comparison to $B_2$ the
effect on the $u_\alpha$ does not show before $\alpha$ = 0.01%, which appears to be smaller
than the range of interest in insurance risk (certainly not in queueing
applications!), and this is presumably a consequence of a heavier tail rather than
larger variance. For $B_1$, $B_2$, $B_3$ the comparison is as expected from the intuition
concerning the variability of these distributions, with the hyperexponential
distribution being more variable than the exponential distribution and the Erlang
distribution less.

Notes and references Further relevant references are Goovaerts et al. [425],
van Heerwaarden [454], Klüppelberg [539], Pellerey [689] and (for the convex
ordering) Makowski [623]. For the harmonic mean residual life order, see Michel [636] and
Trufin, Albrecher & Denuit [854]. For relations between higher-order stop-loss
orderings of claim size distributions and ruin probabilities see Cheng & Pai [236]. Tsai [856]
considers orderings in the presence of perturbations. We return to ordering of ruin
probabilities in a special problem in VII.4 and also in XIII.8.

For the situation that the claim size distribution and the Poisson parameter are
unknown, but a sample of data points is available, Politis [709] considers the problem
of semi-parametric estimation of ruin probabilities.
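
Returning to Example 8.6, the $B_1$ column of the table does not require simulation: for standard exponential claims $\psi(u) = \rho e^{-(1-\beta)u}$ can be inverted in closed form. The short sketch below does this (the function name is an illustrative assumption) and gives a way to gauge the simulation error in that column.

```python
import math

beta, rate = 1 / 1.1, 1.0          # Example 8.6 with B_1 (standard exponential)
rho, gamma = beta / rate, rate - beta

def fractile(alpha):
    """Solve rho * exp(-gamma * u) = alpha for u."""
    return math.log(rho / alpha) / gamma

fractiles = {a: fractile(a) for a in (0.05, 0.01, 0.001, 0.0001)}
for a, u_a in fractiles.items():
    print(a, round(u_a, 1))
```

Rounded to integers, the values agree with the $B_1$ column of the table.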


9 Sensitivity estimates 


In a broad setting, sensitivity analysis (or perturbation analysis) deals with the calculation of the derivative (the gradient in higher dimensions) of a performance measure $s(\theta)$ of a stochastic or deterministic system, the behavior of which is governed by a parameter $\theta$. A standard example from queueing theory is a queueing network, with $\theta$ the vector of service rates at different nodes and routing probabilities, and $s(\theta)$ the expected sojourn time of a customer in the network. In the present setting, $s(\theta)$ is of course the ruin probability $\psi = \psi(u)$ (with $u$ fixed) and $\theta$ a set of parameters determining the arrival rate $\beta$, the premium rate $p$ and the claim size distribution $B$. For example, we may be interested in $\partial\psi/\partial p$ for assessing the effects of a small change in the premium, or we may be interested in $\partial\psi/\partial\beta$ as a measure of the uncertainty on $\psi$ if $\beta$ is only approximately known, say estimated from data.


Example 9.1 Consider the case of claims which are exponential with rate $\delta$ (the premium rate is one). Then $\psi = \frac{\beta}{\delta}e^{-(\delta-\beta)u}$, and hence

\[ \frac{\partial\psi}{\partial\beta} = \frac{1}{\delta}e^{-(\delta-\beta)u} + \frac{\beta u}{\delta}e^{-(\delta-\beta)u} = \Bigl(\frac{1}{\beta}+u\Bigr)\psi(u), \]

which is of the order of magnitude $u\psi(u)$ for large $u$.

Assume for example that $\delta$ is known, while $\beta = \hat\beta$ is an estimate, obtained say in the natural way as the empirical arrival rate $N_t/t$ in $[0,t]$. Then if $t$ is large, the distribution of $\hat\beta-\beta$ is approximately normal $N(0,\beta/t)$. Thus, if $\hat\psi = \frac{\hat\beta}{\delta}e^{-(\delta-\hat\beta)u}$, it follows that $\hat\psi-\psi$ is approximately normal $N(0,\sigma^2/t)$, where

\[ \sigma^2 = \beta\Bigl(\frac{\partial\psi}{\partial\beta}\Bigr)^2 \approx \beta u^2\psi^2. \]

In particular, the standard deviation of the normalized estimate $\hat\psi/\psi$ (the relative error) is approximately $\beta^{1/2}u/t^{1/2}$, i.e. increasing in $u$. Similar conclusions will be found below.
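The closed-form sensitivity of Example 9.1 can be checked numerically. The following is a minimal sketch, assuming exponential claims with rate `delta` and premium rate one; the function names and the parameter values are illustrative choices and not part of the text.

```python
import math

def psi(u, beta, delta):
    """Ruin probability for exponential(delta) claims and premium rate 1."""
    return beta / delta * math.exp(-(delta - beta) * u)

def dpsi_dbeta(u, beta, delta):
    """Closed-form sensitivity from Example 9.1: (1/beta + u) * psi(u)."""
    return (1.0 / beta + u) * psi(u, beta, delta)

def relative_error_sd(u, beta, t):
    """Delta-method s.d. of psi_hat/psi when beta is estimated by N_t/t."""
    return math.sqrt(beta / t) * (1.0 / beta + u)

if __name__ == "__main__":
    beta, delta, u, t, h = 0.7, 1.0, 20.0, 1000.0, 1e-6
    # Central finite difference in beta versus the closed form.
    fd = (psi(u, beta + h, delta) - psi(u, beta - h, delta)) / (2 * h)
    print("finite difference:", fd, "closed form:", dpsi_dbeta(u, beta, delta))
    # Exact delta-method value versus the large-u approximation sqrt(beta/t)*u.
    print("relative error s.d.:", relative_error_sd(u, beta, t),
          "approximation:", math.sqrt(beta / t) * u)
```

The two printed pairs should agree closely; the gap in the second pair is the $1/\beta$ term that becomes negligible for large $u$.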


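Proposition 9.2 Consider a risk process $\{R_t\}$ with a general constant premium rate $p$. Then

\[ \frac{\partial\psi}{\partial p} = -\beta\,\frac{\partial\psi}{\partial\beta}, \]

where the partial derivatives are evaluated at $p = 1$.


Proof. This is an easy time transformation argument in a similar way as in Proposition I.1.3. Let $\tilde R_t = R_{t/p}$. Then the arrival rate $\tilde\beta$ for $\{\tilde R_t\}$ is $\beta/p$, and hence the effect of changing $p$ from $1$ to $1+\Delta p$ corresponds to changing $\beta$ to $\beta/(1+\Delta p) \approx \beta(1-\Delta p)$. Thus at $p=1$,

\[ \frac{\partial\psi}{\partial p} = \frac{\partial\tilde\beta}{\partial p}\,\frac{\partial\psi}{\partial\beta} = -\beta\,\frac{\partial\psi}{\partial\beta}. \]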


As a consequence, it suffices to fix the premium at $p = 1$ and consider only the effects of changing $\beta$ and/or $B$. In the case of the claim size distribution $B$, various parametric families of claim size distributions could be considered, but we shall concentrate on a special structure covering a number of important cases, namely that of a two-parameter exponential family of the form

\[ B_{\theta,\zeta}(dx) = \exp\{\theta x + \zeta t(x) - \omega(\theta,\zeta)\}\,\mu(dx), \quad x>0 \tag{9.1} \]

(see Remark 9.8 below for some discussion of this assumption).

Consider first the adjustment coefficient $\gamma$ as function of $\beta,\theta,\zeta$, and write $\gamma_\beta = \partial\gamma/\partial\beta$ and so on. Similar notation for partial derivatives is used below, e.g. for the ruin probabilities $\psi = \psi(u)$ and the Cramér-Lundberg constant $C$.


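Proposition 9.3

\[ \gamma_\beta = \frac{\gamma}{\beta\bigl(1-(\beta+\gamma)\,\omega_\theta(\theta+\gamma,\zeta)\bigr)}, \tag{9.2} \]

\[ \gamma_\theta = \frac{(\beta+\gamma)\bigl[\omega_\theta(\theta+\gamma,\zeta)-\omega_\theta(\theta,\zeta)\bigr]}{1-(\beta+\gamma)\,\omega_\theta(\theta+\gamma,\zeta)}, \tag{9.3} \]

\[ \gamma_\zeta = \frac{(\beta+\gamma)\bigl[\omega_\zeta(\theta+\gamma,\zeta)-\omega_\zeta(\theta,\zeta)\bigr]}{1-(\beta+\gamma)\,\omega_\theta(\theta+\gamma,\zeta)}. \tag{9.4} \]

Proof. According to (9.8) below, we can rewrite the Lundberg equation as $\omega(\theta+\gamma,\zeta)-\omega(\theta,\zeta) = \log(1+\gamma/\beta)$. Differentiating w.r.t. $\beta$ yields

\[ \omega_\theta(\theta+\gamma,\zeta)\,\gamma_\beta = \frac{1}{1+\gamma/\beta}\Bigl(\frac{\gamma_\beta}{\beta}-\frac{\gamma}{\beta^2}\Bigr). \]

From this (9.2) follows by straightforward algebra, and the proofs of (9.3), (9.4) are similar.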


Now consider the ruin probability $\psi = \psi(u)$ itself. Of course, we cannot expect in general to find explicit expressions like in Example 9.1 or Proposition 9.3, but must look for approximations for the sensitivities $\psi_\beta$, $\psi_\theta$, $\psi_\zeta$. The most intuitive approach is to rely on the accuracy of the Cramér-Lundberg approximation, so that heuristically we obtain

\[ \psi_\zeta \approx \frac{\partial}{\partial\zeta}\,Ce^{-\gamma u} = C_\zeta e^{-\gamma u} - u\gamma_\zeta Ce^{-\gamma u} \approx -u\gamma_\zeta\psi \tag{9.5} \]

as $u\to\infty$. As will be seen below, this intuition is indeed correct. However, mathematically a proof is needed basically to show that two limits ($u\to\infty$ and the differentiation as limit of finite differences) are interchangeable.

Consider first the case of $\partial\psi/\partial\beta$:


Proposition 9.4 As $u\to\infty$, it holds that

\[ \frac{\partial\psi}{\partial\beta} \sim ue^{-\gamma u}\,\frac{C^2\gamma}{\beta(1-\rho)}. \]

Proof. We shall use the renewal equation (3.2) for $\psi(u)$,

\[ \psi(u) = \beta\int_u^\infty \overline B(x)\,dx + \int_0^u \psi(u-x)\,\beta\overline B(x)\,dx. \tag{9.6} \]

Letting $\varphi = \partial\psi/\partial\beta$ and differentiating (9.6), we get

\[ \varphi(u) = \int_u^\infty \overline B(x)\,dx + \int_0^u \psi(u-x)\,\overline B(x)\,dx + \int_0^u \varphi(u-x)\,\beta\overline B(x)\,dx. \]

Proceeding in a similar way as in the proof of the Cramér-Lundberg approximation based upon (9.6) (Section 5), we multiply by $e^{\gamma u}$ and let $Z(u) = e^{\gamma u}\varphi(u)$, $F(dx) = e^{\gamma x}\beta\overline B(x)\,dx$ and $z = z_1+z_2$, where

\[ z_1(u) = e^{\gamma u}\int_u^\infty \overline B(x)\,dx, \qquad z_2(u) = e^{\gamma u}\int_0^u \psi(u-x)\,\overline B(x)\,dx. \]

Then $Z = z + F*Z$ and $F$ is a proper probability distribution. By dominated convergence,

\[ z_2(u) = \frac{1}{\beta}\int_0^u e^{\gamma(u-x)}\psi(u-x)\,F(dx) \to \frac{1}{\beta}\int_0^\infty C\,F(dx) = \frac{C}{\beta} \]

as $u\to\infty$, and also $z_1(u)\to 0$ because of $\hat B[\gamma]<\infty$. Hence by a variant of the key renewal theorem (Proposition A1.2 of the Appendix), $Z(u)/u \to C/(\beta\mu_F)$ where $\mu_F$ is the mean of $F$. But from the proof of Theorem 5.3 (see in particular (5.10)), $\mu_F = (1-\rho)/(C\gamma)$. Combining these estimates, the proof is complete.


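For the following, we note the formulas

\[ E_{\theta,\zeta}\,t(U) = \omega_\zeta(\theta,\zeta), \tag{9.7} \]
\[ E_{\theta,\zeta}\,e^{\alpha U} = \hat B_{\theta,\zeta}[\alpha] = \exp\{\omega(\theta+\alpha,\zeta)-\omega(\theta,\zeta)\}, \tag{9.8} \]
\[ E_{\theta,\zeta}\,t(U)e^{\alpha U} = \omega_\zeta(\theta+\alpha,\zeta)\exp\{\omega(\theta+\alpha,\zeta)-\omega(\theta,\zeta)\} \tag{9.9} \]

which are well-known and easy to show (see e.g. Barndorff-Nielsen [136]). Further write

\[ d_\theta = \bigl[\omega_\theta(\theta+\gamma,\zeta)-\omega_\theta(\theta,\zeta)\bigr]\exp\{\omega(\theta+\gamma,\zeta)-\omega(\theta,\zeta)\} = \bigl[\omega_\theta(\theta+\gamma,\zeta)-\omega_\theta(\theta,\zeta)\bigr]\Bigl(1+\frac{\gamma}{\beta}\Bigr), \]

\[ d_\zeta = \bigl[\omega_\zeta(\theta+\gamma,\zeta)-\omega_\zeta(\theta,\zeta)\bigr]\exp\{\omega(\theta+\gamma,\zeta)-\omega(\theta,\zeta)\} = \bigl[\omega_\zeta(\theta+\gamma,\zeta)-\omega_\zeta(\theta,\zeta)\bigr]\Bigl(1+\frac{\gamma}{\beta}\Bigr). \]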

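Proposition 9.5 Assume that (9.1) holds. Then as $u\to\infty$,

\[ \frac{\partial\psi}{\partial\theta} \sim ue^{-\gamma u}\,\frac{\beta C^2 d_\theta}{1-\rho}, \qquad \frac{\partial\psi}{\partial\zeta} \sim ue^{-\gamma u}\,\frac{\beta C^2 d_\zeta}{1-\rho}. \tag{9.10} \]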

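Proof. By straightforward differentiation,

\[ \frac{\partial\overline B(x)}{\partial\zeta} = \frac{\partial}{\partial\zeta}\int_x^\infty \exp\{\theta y+\zeta t(y)-\omega(\theta,\zeta)\}\,\mu(dy) = \int_x^\infty \bigl[t(y)-\omega_\zeta(\theta,\zeta)\bigr]\,B(dy). \]

Letting $\varphi = \partial\psi/\partial\zeta$, it thus follows from (9.6) that

\[ \varphi(u) = e^{-\gamma u}z_1(u) + e^{-\gamma u}z_2(u) + \int_0^u \varphi(u-x)\,\beta\overline B(x)\,dx, \]

where

\[ z_1(u) = e^{\gamma u}\beta\int_u^\infty\!\!\int_x^\infty \bigl[t(y)-\omega_\zeta(\theta,\zeta)\bigr]\,B(dy)\,dx, \]
\[ z_2(u) = e^{\gamma u}\beta\int_0^u \psi(u-x)\int_x^\infty \bigl[t(y)-\omega_\zeta(\theta,\zeta)\bigr]\,B(dy)\,dx. \]

Multiplying by $e^{\gamma u}$ and letting

\[ Z(u) = e^{\gamma u}\varphi(u), \qquad z = z_1+z_2, \qquad F(dx) = e^{\gamma x}\beta\overline B(x)\,dx, \]

this implies $Z = z + F*Z$. By dominated convergence and (9.7)-(9.9),

\[ z_2(u) \to C\beta\int_0^\infty e^{\gamma x}\int_x^\infty \bigl[t(y)-\omega_\zeta(\theta,\zeta)\bigr]\,B(dy)\,dx = \frac{\beta C}{\gamma}\int_0^\infty \bigl[t(y)-\omega_\zeta(\theta,\zeta)\bigr](e^{\gamma y}-1)\,B(dy) = \frac{\beta C d_\zeta}{\gamma} \]

as $u\to\infty$, and also $z_1(u)\to 0$ because of

\[ \int_0^\infty e^{\gamma y}\bigl|t(y)-\omega_\zeta(\theta,\zeta)\bigr|\,B(dy) < \infty. \]

Hence, using $\mu_F = (1-\rho)/(C\gamma)$ as in the proof of Proposition 9.4,

\[ \frac{Z(u)}{u} \to \frac{\beta C d_\zeta}{\gamma\mu_F} = \frac{\beta C^2 d_\zeta}{1-\rho}, \]

from which the second assertion of (9.10) follows, and the proof of the first one is similar.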


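Example 9.6 Consider the gamma density

\[ b(x) = \frac{\delta^\alpha}{\Gamma(\alpha)}x^{\alpha-1}e^{-\delta x} = \exp\{-\delta x + \alpha\log x - (\log\Gamma(\alpha)-\alpha\log\delta)\}\cdot\frac{1}{x}. \]

Here (9.1) holds with

\[ \mu(dx) = x^{-1}dx, \quad \theta = -\delta, \quad \zeta = \alpha, \quad t(x) = \log x, \]
\[ \omega(\theta,\zeta) = \log\Gamma(\alpha)-\alpha\log\delta = \log\Gamma(\zeta)-\zeta\log(-\theta). \]

We get $\omega_\zeta(\theta,\zeta) = \Psi(\zeta)-\log(-\theta) = \Psi(\alpha)-\log\delta$ where $\Psi = \Gamma'/\Gamma$ is the Digamma function, and $\omega_\theta(\theta,\zeta) = -\zeta/\theta = \alpha/\delta$. It follows after some elementary calculus that $\rho = \alpha\beta/\delta$ and, by inserting in the above formulas (and using the Lundberg equation $(\delta/(\delta-\gamma))^\alpha = 1+\gamma/\beta$), that

\[ C = \frac{(\delta-\alpha\beta)(\delta-\gamma)}{\delta(\alpha\beta+\alpha\gamma+\gamma-\delta)}, \]
\[ d_\theta = \Bigl(\frac{\alpha}{\delta-\gamma}-\frac{\alpha}{\delta}\Bigr)\Bigl(\frac{\delta}{\delta-\gamma}\Bigr)^\alpha, \qquad d_\zeta = \log\Bigl(\frac{\delta}{\delta-\gamma}\Bigr)\Bigl(\frac{\delta}{\delta-\gamma}\Bigr)^\alpha, \]
\[ \gamma_\beta = \frac{\gamma^2-\delta\gamma}{\alpha\beta^2+\alpha\beta\gamma+\beta\gamma-\beta\delta}, \]
\[ \gamma_\theta = -\gamma_\delta = \frac{\alpha\beta\gamma+\alpha\gamma^2}{\delta^2-\delta\gamma-\alpha\beta\delta-\alpha\delta\gamma}, \]
\[ \gamma_\zeta = \gamma_\alpha = \frac{\beta\delta+\delta\gamma-\beta\gamma-\gamma^2}{\delta-\gamma-\alpha\beta-\alpha\gamma}\,\log\Bigl(\frac{\delta}{\delta-\gamma}\Bigr). \]

Finally, (9.10) takes the form

\[ \frac{\partial\psi}{\partial\delta} = -\frac{\partial\psi}{\partial\theta} \sim -ue^{-\gamma u}\,\frac{\beta C^2 d_\theta}{1-\rho}, \qquad \frac{\partial\psi}{\partial\alpha} = \frac{\partial\psi}{\partial\zeta} \sim ue^{-\gamma u}\,\frac{\beta C^2 d_\zeta}{1-\rho}. \]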
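For gamma claims the formulas (9.2)-(9.4) can be compared with finite differences of a numerically computed adjustment coefficient. The sketch below assumes the gamma parametrization of Example 9.6 ($\theta=-\delta$, $\zeta=\alpha$) and uses plain bisection for the Lundberg root; the function names, the bracketing constants and the test values are illustrative assumptions.

```python
import math

def lundberg_root(beta, alpha, delta, iters=200):
    """Positive root of kappa(s) = beta*((delta/(delta-s))**alpha - 1) - s on (0, delta).

    Assumes rho = alpha*beta/delta < 1, so kappa is negative just right of 0.
    """
    def kappa(s):
        return beta * ((delta / (delta - s)) ** alpha - 1.0) - s
    lo, hi = 1e-6 * delta, delta * (1.0 - 1e-12)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if kappa(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def gamma_sensitivities(beta, alpha, delta):
    """Return (gamma, gamma_beta, gamma_theta, gamma_zeta) from (9.2)-(9.4)."""
    g = lundberg_root(beta, alpha, delta)
    # omega_theta(theta + gamma, zeta) = alpha / (delta - gamma) for the gamma family
    denom = 1.0 - (beta + g) * alpha / (delta - g)
    g_beta = g / (beta * denom)
    g_theta = (beta + g) * (alpha / (delta - g) - alpha / delta) / denom
    g_zeta = (beta + g) * math.log(delta / (delta - g)) / denom
    return g, g_beta, g_theta, g_zeta

if __name__ == "__main__":
    beta, alpha, delta, h = 1.0, 2.0, 3.0, 1e-5
    g, g_b, g_t, g_z = gamma_sensitivities(beta, alpha, delta)
    fd_b = (lundberg_root(beta + h, alpha, delta) - lundberg_root(beta - h, alpha, delta)) / (2 * h)
    # theta = -delta, so the theta-derivative is minus the delta-derivative
    fd_t = -(lundberg_root(beta, alpha, delta + h) - lundberg_root(beta, alpha, delta - h)) / (2 * h)
    fd_z = (lundberg_root(beta, alpha + h, delta) - lundberg_root(beta, alpha - h, delta)) / (2 * h)
    print(g, (g_b, fd_b), (g_t, fd_t), (g_z, fd_z))
```

For $\beta=1$, $\alpha=2$, $\delta=3$ the Lundberg equation reduces to $\gamma^2-5\gamma+3=0$, so the root can be checked against $(5-\sqrt{13})/2$.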


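Example 9.7 Consider the inverse Gaussian density

\[ b_{\xi,c}(x) = \frac{c}{x^{3/2}\sqrt{2\pi}}\exp\Bigl\{\xi c - \frac12\Bigl(\frac{c^2}{x}+\xi^2x\Bigr)\Bigr\}. \tag{9.11} \]

This has the form (9.1) with

\[ \mu(dx) = \frac{1}{x^{3/2}\sqrt{2\pi}}\,dx, \quad \theta = -\frac{\xi^2}{2}, \quad \zeta = -\frac{c^2}{2}, \quad t(x) = \frac{1}{x}, \tag{9.12} \]

\[ \omega(\theta,\zeta) = -\xi c-\log c = -2\sqrt{(-\theta)(-\zeta)}-\frac12\log(-\zeta)-\frac12\log 2. \tag{9.13} \]

In particular, for $\alpha\le\alpha^* = \xi^2/2$,

\[ \hat B_{\theta,\zeta}[\alpha] = \exp\{\omega(\theta+\alpha,\zeta)-\omega(\theta,\zeta)\} = \exp\bigl\{c\bigl(\xi-\sqrt{\xi^2-2\alpha}\bigr)\bigr\}. \tag{9.14} \]

Thus the condition $\hat B[\alpha^*] > 1+\alpha^*/\beta$ of Section 6a needed for the existence of $\gamma$ becomes $e^{\xi c} > 1+\xi^2/(2\beta)$. Straightforward but tedious calculations, which we omit in part, further yield

\[ \beta\hat B'[\gamma]-1 = \frac{\beta c\exp\bigl\{c\bigl(\xi-\sqrt{\xi^2-2\gamma}\bigr)\bigr\}}{\sqrt{\xi^2-2\gamma}}-1, \qquad \rho = \frac{\beta c}{\xi}, \]
\[ \omega_\theta(\theta,\zeta) = \frac{c}{\xi}, \qquad \omega_\zeta(\theta,\zeta) = \frac{\xi}{c}+\frac{1}{c^2}, \]
\[ d_\theta = c\Bigl(\frac{1}{\sqrt{\xi^2-2\gamma}}-\frac{1}{\xi}\Bigr)\Bigl(1+\frac{\gamma}{\beta}\Bigr), \qquad d_\zeta = \frac{\sqrt{\xi^2-2\gamma}-\xi}{c}\Bigl(1+\frac{\gamma}{\beta}\Bigr), \]
\[ \gamma_\beta = \frac{\gamma\sqrt{\xi^2-2\gamma}}{\beta\bigl(\sqrt{\xi^2-2\gamma}-c(\gamma+\beta)\bigr)}, \]
\[ \gamma_\theta = \frac{c(\beta+\gamma)\bigl(\xi-\sqrt{\xi^2-2\gamma}\bigr)}{\xi\bigl(\sqrt{\xi^2-2\gamma}-c(\gamma+\beta)\bigr)}, \qquad \gamma_\zeta = \frac{(\beta+\gamma)\sqrt{\xi^2-2\gamma}\,\bigl(\sqrt{\xi^2-2\gamma}-\xi\bigr)}{c\bigl(\sqrt{\xi^2-2\gamma}-c(\gamma+\beta)\bigr)}. \]

Finally, (9.10) takes the form

\[ \frac{\partial\psi}{\partial\xi} = -\xi\,\frac{\partial\psi}{\partial\theta} \sim -\xi ue^{-\gamma u}\,\frac{\beta C^2 d_\theta}{1-\rho}, \qquad \frac{\partial\psi}{\partial c} = -c\,\frac{\partial\psi}{\partial\zeta} \sim -cue^{-\gamma u}\,\frac{\beta C^2 d_\zeta}{1-\rho}. \]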

Remark 9.8 The specific form of (9.1) is motivated as follows. In general, the exponent of the density in an exponential family has the form $\theta_1t_1(x)+\cdots+\theta_kt_k(x)$. Thus, we have assumed $k=2$ and $t_1(x) = x$. That it is no restriction to assume one of the $t_i(x)$ to be linear follows since the whole set-up requires exponential moments to be finite (thus we can always extend the family if necessary by adding a term $\theta x$). That it is no restriction to assume $k\le 2$ follows since if $k>2$, we can just fix $k-2$ of the parameters. Finally if $k=1$, the exponent is either $\theta x$, in which case we can just let $t(x) = 0$, or $\zeta t(x)$, in which case the extension just described applies.

Notes and references The general area of sensitivity analysis (gradient estimation) is currently receiving considerable interest in queueing theory. However, the models there (e.g. queueing networks) are typically much more complicated than the one considered here, and hence explicit or asymptotic estimates are in general not possible. Thus, the main tool is simulation, for which we refer to XV.7 and references therein.

Comparatively less work seems to have been done in risk theory; thus, to our 
knowledge, the results presented here are new. Van Wouve et al. [861] consider a 
special problem related to reinsurance. For the study of perturbation via perturbed 
renewal equations, see Gyllenberg & Silvestrov [444]. 


10 Estimation of the adjustment coefficient 


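We consider a non-parametric set-up where $\beta$, $B$ are assumed to be completely unknown, and we estimate $\gamma$ by means of the empirical solution $\gamma_T$ to the Lundberg equation. To this end, let

\[ \beta_T = \frac{N_T}{T}, \qquad \hat B_T[\alpha] = \frac{1}{N_T}\sum_{i=1}^{N_T}e^{\alpha U_i}, \qquad \kappa_T(\alpha) = \beta_T\bigl(\hat B_T[\alpha]-1\bigr)-\alpha, \]

and let $\gamma_T$ be defined by $\kappa_T(\gamma_T) = 0$.

Note that if $N_T = 0$, then $\hat B_T$ and hence $\gamma_T$ is undefined. Also, if

\[ \rho_T = \frac{1}{T}\bigl(U_1+\cdots+U_{N_T}\bigr) > 1, \]

then $\gamma_T<0$. However, by the LLN both $P(N_T = 0)$ and $P(\rho_T>1)$ converge to 0 as $T\to\infty$.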
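The empirical adjustment coefficient can be illustrated on simulated data. The following is a minimal sketch, assuming a compound Poisson process with exponential claims observed on $[0,T]$; the simulator, the root-bracketing strategy and all names and parameter values are illustrative assumptions and not part of the text.

```python
import math
import random

def simulate_claims(beta, delta, T, rng):
    """Claim sizes observed in [0, T]: Poisson(beta) arrivals, Exp(delta) claims."""
    claims = []
    t = rng.expovariate(beta)
    while t <= T:
        claims.append(rng.expovariate(delta))
        t += rng.expovariate(beta)
    return claims

def empirical_gamma(claims, T, iters=60):
    """Positive root gamma_T of kappa_T(a) = beta_T*(B_T[a] - 1) - a."""
    n = len(claims)
    if n == 0:
        raise ValueError("N_T = 0: the estimator is undefined")
    if sum(claims) / T >= 1.0:
        raise ValueError("rho_T >= 1: no positive root")
    beta_T = n / T

    def kappa_T(a):
        return beta_T * (sum(math.exp(a * x) for x in claims) / n - 1.0) - a

    lo, hi = 1e-6, 1.0
    while kappa_T(hi) <= 0.0:      # expand until the convex function turns positive
        hi *= 2.0
    for _ in range(iters):         # bisection on the sign change
        mid = 0.5 * (lo + hi)
        if kappa_T(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    rng = random.Random(2024)
    beta, delta, T = 1.5, 2.0, 20000.0   # true gamma = delta - beta = 0.5
    print(empirical_gamma(simulate_claims(beta, delta, T, rng), T))
```

The parameters are chosen with $\rho = 3/4 > 1/2$, so that the exponential moment of order $2\gamma$ of the claims is finite; the estimate should then be close to the true value $\delta-\beta$.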


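Theorem 10.1 As $T\to\infty$, $\gamma_T \stackrel{\mathrm{a.s.}}{\to} \gamma$. If furthermore $\hat B[2\gamma]<\infty$, then
\[ \gamma_T-\gamma \approx N\Bigl(0,\frac{1}{T}\sigma_\gamma^2\Bigr), \]
where $\sigma_\gamma^2 = \beta\kappa(2\gamma)/\kappa'(\gamma)^2$.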


For the proof, we need a lemma. 


Lemma 10.2 As T > œ, 


Bro) ~ (Bh), SA), (10.1) 
kr) & n (S20) (10.2) 
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Proof. Since 


Var(e) = Ee?” — (Ee)? = Bl2] - Bir, 


we have pi pi 
Le ~ , B[2y] — Biy)? 
Don eS n (Bp) A BOF) 
j= 1 


Hence (10.1) follows from Nr/T “3 8 and Anscombe’s theorem. More generally, 
since Nr/T ~ N(G, G/T), it is easy to see that we can write 


(i) = (ah) 38 ( eae 
a Bol © vT \ vB BI - Bri? va 
where V1, V2 are independent N(0, 1) r.v.’s. Hence 


kr(y) = (8+ (br -8))((Brly] — Bh) + (Bh) -1)) -7 
1) — y+ (6r — B)(Bly] - 1) + B(Brly] - Bh) 


~ p(B 

= 0+ {VAh -1)v + VAB- Br - V4} 
2 (0,2 {Bp - 1)? + Ber - 8h}}) 

= n(E{ Be 3 it) 


which is the same as (10.2). 


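Proof of Theorem 10.1. By the law of large numbers,

\[ \beta_T \stackrel{\mathrm{a.s.}}{\to} \beta, \qquad \hat B_T[\alpha] \stackrel{\mathrm{a.s.}}{\to} \hat B[\alpha], \qquad \kappa_T(\alpha) \stackrel{\mathrm{a.s.}}{\to} \kappa(\alpha). \]

Let $0<\epsilon<\gamma$. Then
\[ \kappa(\gamma-\epsilon) < 0 < \kappa(\gamma+\epsilon) \]
and hence
\[ \kappa_T(\gamma-\epsilon) < 0 < \kappa_T(\gamma+\epsilon) \]
for all sufficiently large $T$. I.e., $\gamma_T\in(\gamma-\epsilon,\gamma+\epsilon)$ eventually, and the truth of this for all $\epsilon>0$ implies $\gamma_T \stackrel{\mathrm{a.s.}}{\to} \gamma$.

Now write
\[ \kappa_T(\gamma_T)-\kappa_T(\gamma) = \kappa_T'(\gamma_T^*)(\gamma_T-\gamma), \tag{10.3} \]
where $\gamma_T^*$ is some point between $\gamma_T$ and $\gamma$. If $\gamma_T\in(\gamma-\epsilon,\gamma+\epsilon)$, we have

\[ \kappa_T'(\gamma-\epsilon) < \kappa_T'(\gamma_T^*) < \kappa_T'(\gamma+\epsilon). \]

By the law of large numbers,

\[ \hat B_T'[\alpha] = \frac{1}{N_T}\sum_{i=1}^{N_T}U_ie^{\alpha U_i} \stackrel{\mathrm{a.s.}}{\to} EUe^{\alpha U} = \hat B'[\alpha]. \]

Hence $\kappa_T'(\alpha) \stackrel{\mathrm{a.s.}}{\to} \kappa'(\alpha)$ for all $\alpha$ so that for all sufficiently large $T$
\[ \kappa'(\gamma-\epsilon) < \kappa_T'(\gamma_T^*) < \kappa'(\gamma+\epsilon), \]
which implies $\kappa_T'(\gamma_T^*) \stackrel{\mathrm{a.s.}}{\to} \kappa'(\gamma)$.

Combining (10.3) and Lemma 10.2, and using $\kappa_T(\gamma_T) = 0$, it follows that

\[ \gamma_T-\gamma = -\frac{\kappa_T(\gamma)}{\kappa_T'(\gamma_T^*)} \approx -\frac{\kappa_T(\gamma)}{\kappa'(\gamma)} \approx N(0,\sigma_\gamma^2/T). \]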

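Theorem 10.1 can be used to obtain error bounds on the ruin probabilities when the parameters $\beta$, $B$ are estimated from data. To this end, first note that

\[ e^{-\gamma_Tu} \approx N\bigl(e^{-\gamma u},\,u^2e^{-2\gamma u}\sigma_\gamma^2/T\bigr). \]

Thus an asymptotic upper $\alpha$ confidence bound for $e^{-\gamma u}$ (and hence by Lundberg's inequality for $\psi(u)$) is

\[ e^{-\gamma_Tu} + \frac{f_\alpha\,ue^{-\gamma_Tu}\sigma_{\gamma;T}}{\sqrt T}, \]

where $\sigma_{\gamma;T}^2 = \beta_T\kappa_T(2\gamma_T)/\kappa_T'(\gamma_T)^2$ is the empirical estimate of $\sigma_\gamma^2$ and $f_\alpha$ satisfies $\Phi(-f_\alpha) = \alpha$ (e.g., $f_\alpha = 1.96$ if $\alpha = 2.5\%$).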


Notes and references Theorem 10.1 is from Grandell [428]. A major restriction of the approach is the condition $\hat B[2\gamma]<\infty$ which may be quite restrictive. For example, if $B$ is exponential with rate $\delta$ so that $\gamma = \delta-\beta$, it means $2(\delta-\beta)<\delta$, i.e. $\delta<2\beta$ or equivalently $\rho>1/2$ or $\eta<100\%$. For this reason, various alternatives have been developed. One (see Schmidli [771]) is to let $\{V_t\}$ be the workload process of an M/G/1 queue with the same arrival epochs as the risk process and service times $U_1,U_2,\ldots$, i.e. $V_t = S_t-\inf_{0\le v\le t}S_v$. Letting

\[ \omega_0 = 0, \qquad \omega_n = \inf\{t>\omega_{n-1}:\ V_t = 0,\ V_s>0 \text{ for some } s\in[\omega_{n-1},t]\}, \]

the $n$th busy cycle is then $[\omega_{n-1},\omega_n)$, and the known fact that the

\[ Y_n = \max_{t\in[\omega_{n-1},\omega_n)}V_t \]

are i.i.d. with a tail of the form $P(Y>y)\sim C_1e^{-\gamma y}$ (see e.g. Asmussen [65]) can then be used to produce an estimate of $\gamma$. This approach in fact applies also for many models more general than the compound Poisson one.

Further work on estimation of $\gamma$ with different methods can be found in Csörgő & Steinebach [268], Csörgő & Teugels [269], Deheuvels & Steinebach [285], Embrechts & Mikosch [348], Herkenrath [459], Hipp [464, 465], Frees [371], Mammitzsch [628], Brito & Freitas [202], Conti [256] and Pitts, Grübel & Embrechts [707].

Chapter V


The probability of ruin 
within finite time 


This chapter is concerned with the finite time ruin probabilities

\[ \psi(u,T) = P(\tau(u)\le T) = P\Bigl(\inf_{0\le t\le T}R_t<0 \,\Big|\, R_0 = u\Bigr) = P\Bigl(\sup_{0\le t\le T}S_t>u\Bigr). \]

Only the compound Poisson case is treated; generalizations to other models are either discussed in the Notes and References or in relevant chapters.

The notation is essentially as in Chapter IV. In particular, the premium rate is 1, the Poisson intensity is $\beta$ and the claim size distribution is $B$ with m.g.f. $\hat B[\cdot]$ and mean $\mu_B$. The safety loading is $\eta = 1/\rho-1$ where $\rho = \beta\mu_B$. Unless otherwise stated, it is assumed that $\eta>0$ and that the adjustment coefficient (Lundberg exponent) $\gamma$, defined as solution of $\kappa(\gamma) = 0$ where $\kappa(\alpha) = \beta(\hat B[\alpha]-1)-\alpha$, exists. Further let $\gamma_m$ be the unique point in $(0,\gamma)$ where $\kappa(\alpha)$ attains its minimum value, see Fig. V.1 (the role of $\gamma_m$ will be explained in Section 4b).

The claims surplus is $\{S_t\}$, the time of ruin is $\tau(u)$ and $\xi(u) = S_{\tau(u)}-u$ is the overshoot.

FIGURE V.1


1 Exponential claims 


Proposition 1.1 In the compound Poisson model with exponential claims with rate $\nu$ and safety loading $\eta>0$, the conditional mean and variance of the time to ruin are given by

\[ E[\tau(u)\mid\tau(u)<\infty] = \frac{\beta u+1}{\nu-\beta}, \tag{1.1} \]
\[ \mathrm{Var}[\tau(u)\mid\tau(u)<\infty] = \frac{2\beta\nu u+\beta+\nu}{(\nu-\beta)^3}. \tag{1.2} \]

Proof. Let as in Example IV.5.1 $P_L$, $E_L$ refer to the exponentially tilted process with arrival intensity $\nu$ and exponential claims with rate $\beta$ (thus, $\rho_L = \nu/\beta = 1/\rho>1$). By the likelihood identity IV.(4.8), we have for $k = 1,2$ that

\[ E[\tau(u)^k;\ \tau(u)<\infty] = E_L\bigl[\tau(u)^ke^{-\gamma S_{\tau(u)}}\bigr] = e^{-\gamma u}E_Le^{-\gamma\xi(u)}\,E_L\tau(u)^k = e^{-\gamma u}\frac{\beta}{\nu}E_L\tau(u)^k = \psi(u)E_L\tau(u)^k, \]

using that the overshoot $\xi(u)$ is exponential with rate $\beta$ w.r.t. $P_L$ and independent of $\tau(u)$. In particular,

\[ E[\tau(u)\mid\tau(u)<\infty] = E_L\tau(u), \qquad \mathrm{Var}[\tau(u)\mid\tau(u)<\infty] = \mathrm{Var}_L\tau(u). \]

For (1.1), we have by Wald's identity that (note that $E_LS_t = t(\rho_L-1)$)

\[ E_LS_{\tau(u)} = (\rho_L-1)E_L\tau(u), \]
\[ E_L\tau(u) = \frac{u+E_L\xi(u)}{\rho_L-1} = \frac{u+1/\beta}{\nu/\beta-1} = \frac{\beta u+1}{\nu-\beta}. \]

For (1.2), Wald's second moment identity yields

\[ E_L\bigl(S_{\tau(u)}-(\rho_L-1)\tau(u)\bigr)^2 = \sigma_L^2E_L\tau(u) \]

where $\sigma_L^2 = \kappa''(\gamma) = 2\nu/\beta^2$. Since $S_{\tau(u)}$ and $(\rho_L-1)\tau(u)$ are independent with the same mean, the l.h.s. is

\[ \mathrm{Var}_LS_{\tau(u)}+\mathrm{Var}_L\bigl((\rho_L-1)\tau(u)\bigr) = \mathrm{Var}_L\xi(u)+(\rho_L-1)^2\mathrm{Var}_L\tau(u) = \frac{1}{\beta^2}+\Bigl(\frac{\nu}{\beta}-1\Bigr)^2\mathrm{Var}_L\tau(u). \]

Thus the l.h.s. of (1.2) is

\[ \frac{\sigma_L^2E_L\tau(u)-1/\beta^2}{(\nu/\beta-1)^2} = \frac{2\nu(\beta u+1)/(\nu-\beta)-1}{(\nu-\beta)^2}, \]

which is the same as the r.h.s.
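The moment formulas (1.1) and (1.2) can be checked by simulation under the tilted measure used in the proof, where ruin occurs with probability one. The sketch below assumes the tilted dynamics stated there (arrival intensity $\nu$, exponential claims with rate $\beta$, premium rate 1); the function names, the seed and the sample size are illustrative choices.

```python
import random

def conditional_mean(u, beta, nu):
    """Formula (1.1)."""
    return (beta * u + 1.0) / (nu - beta)

def conditional_variance(u, beta, nu):
    """Formula (1.2)."""
    return (2.0 * beta * nu * u + beta + nu) / (nu - beta) ** 3

def tilted_ruin_time(u, beta, nu, rng):
    """One draw of tau(u) under P_L: arrivals at rate nu, Exp(beta) claims.

    Ruin can only happen at a claim epoch, so it suffices to inspect the
    claims surplus (total claims minus elapsed time) right after each claim.
    """
    t, total_claims = 0.0, 0.0
    while True:
        t += rng.expovariate(nu)
        total_claims += rng.expovariate(beta)
        if total_claims - t > u:
            return t

if __name__ == "__main__":
    rng = random.Random(7)
    u, beta, nu, n = 5.0, 1.0, 2.0, 20000
    draws = [tilted_ruin_time(u, beta, nu, rng) for _ in range(n)]
    mean = sum(draws) / n
    var = sum((x - mean) ** 2 for x in draws) / (n - 1)
    print(mean, conditional_mean(u, beta, nu))
    print(var, conditional_variance(u, beta, nu))
```

Simulating under $P_L$ rather than under the original measure avoids discarding the paths on which ruin never occurs.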


Proposition 1.2 In the compound Poisson model with exponential claims with rate $\nu$ and safety loading $\eta>0$, the Laplace transform of the time to ruin is given by

\[ Ee^{-\delta\tau(u)} = E[e^{-\delta\tau(u)};\ \tau(u)<\infty] = e^{-\rho_\delta u}\Bigl(1-\frac{\rho_\delta}{\nu}\Bigr) \tag{1.3} \]

for $\delta>\kappa(\gamma_m) = 2\sqrt{\beta\nu}-\beta-\nu$, where

\[ \rho_\delta = \frac{\nu-\beta-\delta+\sqrt{(\nu-\beta-\delta)^2+4\delta\nu}}{2}. \]

Proof. It is readily checked that $\gamma_m = \nu-\sqrt{\beta\nu}$ and hence that the value of $\kappa(\gamma_m)$ is as asserted.

Let $\rho_\delta>\gamma_m$ be determined by $\kappa(\rho_\delta) = \delta$. This means that $\beta(\nu/(\nu-\rho_\delta)-1)-\rho_\delta = \delta$, which leads to the quadratic $\rho_\delta^2+(\beta-\nu+\delta)\rho_\delta-\nu\delta = 0$ with solution $\rho_\delta$ (the sign of the square root is $+$ because $\rho_\delta>0$). But by the fundamental likelihood ratio identity (Theorem IV.4.3) we have

\[ E[e^{-\delta\tau(u)};\ \tau(u)<\infty] = E_{\rho_\delta}\bigl[\exp\{-\delta\tau(u)-\rho_\delta S_{\tau(u)}+\tau(u)\kappa(\rho_\delta)\};\ \tau(u)<\infty\bigr] = e^{-\rho_\delta u}E_{\rho_\delta}e^{-\rho_\delta\xi(u)} = e^{-\rho_\delta u}\frac{\nu_{\rho_\delta}}{\nu_{\rho_\delta}+\rho_\delta}, \]

where we used that $P_{\rho_\delta}(\tau(u)<\infty) = 1$ because $\rho_\delta>\gamma_m$ and hence $E_{\rho_\delta}S_1 = \kappa'(\rho_\delta)>0$. Using $\nu_{\rho_\delta} = \nu-\rho_\delta$, the result follows.

Note that it follows from Proposition 1.2 that we can write

\[ Ee^{-\delta\tau(u)} = e^{-\rho_\delta u}\,Ee^{-\delta\tau(0)}. \tag{1.4} \]

The interpretation of this is that $\tau(u)$ can be written as the independent sum of $\tau(0)$ plus a r.v. $Y(u)$ belonging to a convolution semigroup. More precisely,

\[ \tau(u) = \tau+\sum_{k=1}^{M(u)}T_k \tag{1.5} \]

where $\tau = \tau(0)$ is the length of the first ladder segment, $T_1,T_2,\ldots$ are the lengths of the ladder segments $2,3,\ldots$, and $M(u)+1$ is the index of the ladder segment corresponding to $\tau(u)$. Cf. Fig. V.2, where $Y_1,Y_2,\ldots$ are the ladder heights which form a terminating sequence of exponential r.v.'s with rate $\nu$.

FIGURE V.2


For numerical purposes, the following formula is convenient by allowing $\psi(u,T)$ to be evaluated by numerical integration:

Proposition 1.3 Assume that claims are exponential with rate $\nu = 1$. Then

\[ \psi(u,T) = \beta e^{-(1-\beta)u}-\frac{1}{\pi}\int_0^\pi\frac{f_1(\theta)f_2(\theta)}{f_3(\theta)}\,d\theta \tag{1.6} \]

where

\[ f_1(\theta) = \beta\exp\bigl\{2\sqrt\beta\,T\cos\theta-(1+\beta)T+u(\sqrt\beta\cos\theta-1)\bigr\}, \]
\[ f_2(\theta) = \cos\bigl(u\sqrt\beta\sin\theta\bigr)-\cos\bigl(u\sqrt\beta\sin\theta+2\theta\bigr), \]
\[ f_3(\theta) = 1+\beta-2\sqrt\beta\cos\theta. \]

Note that the case $\nu\ne1$ is easily reduced to the case $\nu = 1$ via the formula $\psi_{\beta,\nu}(u,T) = \psi_{\beta/\nu,1}(\nu u,\nu T)$.
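Formula (1.6) is straightforward to evaluate numerically. The following is a minimal sketch, assuming exponential claims with rate $\nu = 1$, premium rate 1 and $0<\beta<1$, and using composite Simpson quadrature; the function name and the number of quadrature nodes are illustrative choices.

```python
import math

def finite_time_ruin(u, T, beta, n=2000):
    """psi(u, T) from (1.6) for Exp(1) claims; n must be even (Simpson's rule)."""
    sb = math.sqrt(beta)

    def integrand(th):
        f1 = beta * math.exp(2.0 * sb * T * math.cos(th) - (1.0 + beta) * T
                             + u * (sb * math.cos(th) - 1.0))
        f2 = math.cos(u * sb * math.sin(th)) - math.cos(u * sb * math.sin(th) + 2.0 * th)
        f3 = 1.0 + beta - 2.0 * sb * math.cos(th)
        return f1 * f2 / f3

    h = math.pi / n
    s = integrand(0.0) + integrand(math.pi)
    for k in range(1, n):
        s += (4.0 if k % 2 else 2.0) * integrand(k * h)
    integral = s * h / 3.0
    return beta * math.exp(-(1.0 - beta) * u) - integral / math.pi

if __name__ == "__main__":
    beta, u = 0.5, 3.0
    for T in (0.0, 1.0, 5.0, 50.0, 500.0):
        print(T, finite_time_ruin(u, T, beta))
    print("infinite horizon:", beta * math.exp(-(1.0 - beta) * u))
```

Two consistency checks are built into the formula: the integral cancels the first term at $T = 0$, so that $\psi(u,0) = 0$, and it vanishes as $T\to\infty$, leaving the infinite horizon value $\beta e^{-(1-\beta)u}$. For $\nu\ne1$ one would call the function with $(\nu u,\nu T,\beta/\nu)$, following the reduction stated after the proposition.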


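Proof. We use the formula $\psi(u,T) = P(V_T>u)$ where $\{V_t\}$ is the workload process in an initially empty M/M/1 queue with arrival rate $\beta$ and service rate $\nu = 1$, cf. Corollary III.3.6. Let $\{Q_t\}$ be the queue length process of the queue (number in system, including the customer being currently served). If $Q_T = N>0$, then $V_T = U_{1,T}+\cdots+U_{N,T}$, where $U_{1,T}$ is the residual service time of the customer being currently served and $U_{2,T},\ldots,U_{N,T}$ the service times of the customers awaiting service. Since $U_{1,T},U_{2,T},\ldots,U_{N,T}$ are conditionally i.i.d. and exponential with rate $\nu = 1$, the conditional distribution of $V_T$ given $Q_T = N$ is that of $E_N$ where the r.v. $E_N$ has an Erlang distribution with parameters $(N,1)$, i.e. density $x^{N-1}e^{-x}/(N-1)!$. Hence

\[ \psi(u,T) = P(V_T>u) = \sum_{N=1}^\infty P(Q_T = N)P(E_N>u) = \sum_{N=1}^\infty P(Q_T = N)\sum_{k=0}^{N-1}e^{-u}\frac{u^k}{k!} = \sum_{k=0}^\infty e^{-u}\frac{u^k}{k!}P(Q_T\ge k+1). \tag{1.7} \]

For $j = 0,1,2,\ldots$, let (cf. [4])

\[ I_j(x) = \sum_{n=0}^\infty\frac{(x/2)^{2n+j}}{n!\,(n+j)!} = \frac{1}{\pi}\int_0^\pi e^{x\cos\theta}\cos(j\theta)\,d\theta \tag{1.8} \]

denote the modified Bessel function of order $j$, let $I_{-j}(x) = I_j(x)$, and define $\iota_j = e^{-(1+\beta)T}\beta^{j/2}I_j(2\sqrt\beta\,T)$. Then (see Prabhu [712, pp. 9-12], in particular equations (1.38), (1.44); similar formulas are in [APQ, pp. 87-89])

\[ \sum_{j=-\infty}^\infty\iota_j = 1, \]
\[ P(Q_T\ge k+1) = 1-\sum_{j=-\infty}^{k}\iota_j+\beta^{k+1}\sum_{j=-\infty}^{-k-2}\iota_j = \beta^{k+1}+\sum_{j=k+1}^\infty\iota_j-\beta^{k+1}\sum_{j=-k-1}^\infty\iota_j. \]

By Euler's formulas,

\[ \sum_{j=k+1}^\infty\beta^{j/2}\cos(j\theta) = \Re\sum_{j=k+1}^\infty\beta^{j/2}e^{ij\theta} = \Re\Bigl[\frac{\beta^{(k+1)/2}e^{i(k+1)\theta}}{1-\beta^{1/2}e^{i\theta}}\Bigr] = -\frac{\Re\bigl[\beta^{(k+1)/2}e^{i(k+1)\theta}(\beta^{1/2}e^{-i\theta}-1)\bigr]}{|\beta^{1/2}e^{i\theta}-1|^2} = -\frac{\beta^{(k+1)/2}\bigl[\beta^{1/2}\cos(k\theta)-\cos((k+1)\theta)\bigr]}{f_3(\theta)}, \]

\[ \beta^{k+1}\sum_{j=-k-1}^\infty\beta^{j/2}\cos(j\theta) = \beta^{k+1}\,\Re\Bigl[\frac{\beta^{-(k+1)/2}e^{-i(k+1)\theta}}{1-\beta^{1/2}e^{i\theta}}\Bigr] = -\frac{\Re\bigl[\beta^{(k+1)/2}e^{-i(k+1)\theta}(\beta^{1/2}e^{-i\theta}-1)\bigr]}{|\beta^{1/2}e^{i\theta}-1|^2} = -\frac{\beta^{(k+1)/2}\bigl[\beta^{1/2}\cos((k+2)\theta)-\cos((k+1)\theta)\bigr]}{f_3(\theta)}. \]

Hence the integral expression in (1.8) yields

\[ P(Q_T\ge k+1)-\beta^{k+1} = -\frac{e^{-(1+\beta)T}}{\pi}\int_0^\pi e^{2\sqrt\beta\,T\cos\theta}\,\frac{\beta^{(k+2)/2}\bigl[\cos(k\theta)-\cos((k+2)\theta)\bigr]}{f_3(\theta)}\,d\theta. \]

Since $P(Q_\infty\ge k+1) = \beta^{k+1}$, it follows as in (1.7) that

\[ \psi(u) = \sum_{k=0}^\infty e^{-u}\frac{u^k}{k!}\beta^{k+1} = \beta e^{-(1-\beta)u}. \]

A further application of Euler's formulas yields

\[ \sum_{k=0}^\infty\beta^{k/2}\frac{u^k}{k!}\cos((k+2)\theta) = \Re\Bigl[e^{2i\theta}\sum_{k=0}^\infty\frac{(u\beta^{1/2}e^{i\theta})^k}{k!}\Bigr] = \Re\bigl[e^{2i\theta}\exp\{u\beta^{1/2}e^{i\theta}\}\bigr] = e^{u\beta^{1/2}\cos\theta}\cos\bigl(u\beta^{1/2}\sin\theta+2\theta\bigr), \]

\[ \sum_{k=0}^\infty\beta^{k/2}\frac{u^k}{k!}\cos(k\theta) = \Re\Bigl[\sum_{k=0}^\infty\frac{(u\beta^{1/2}e^{i\theta})^k}{k!}\Bigr] = e^{u\beta^{1/2}\cos\theta}\cos\bigl(u\beta^{1/2}\sin\theta\bigr). \]

The rest of the proof is easy algebra.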


Notes and references Proposition 1.3 was given in Asmussen [55] (as pointed out by Barndorff-Nielsen & Schmidli [138], there are several misprints in the formula there; however, the numerical examples in [55] are correct). Related formulas are in Takács [827]. Seal [785] gives a different numerical integration formula for $1-\psi(u,T)$ which, however, is numerically unstable for large $T$.

Alternatively, by using generators one can also represent $\psi(u,T)$ as the solution of the partial integro-differential equation

\[ \frac{\partial\psi(u,T)}{\partial u}-\frac{\partial\psi(u,T)}{\partial T}-\beta\psi(u,T)+\beta\int_0^u\psi(u-y,T)\,dB(y)+\beta(1-B(u)) = 0 \]

with boundary conditions $\lim_{u\to\infty}\psi(u,T) = 0$ for all $T>0$ and $\psi(u,0) = 0$ for all $u>0$. For exponential claims this equation can be transformed into a second-order partial differential equation, which in Pervozvansky [694] was solved by Laplace transformation w.r.t. $T$ and careful applications of the Cauchy residue theorem, resulting in an alternative integral representation of (1.6) in terms of trigonometric functions.


2 The ruin probability with no initial reserve 


In this section, we are concerned with describing the distribution of the ruin time $\tau(0)$ in the case where the initial reserve is $u = 0$. We allow a general claim size distribution $B$ and recall that we have the explicit formula $\psi(0) = P(\tau(0)<\infty) = \rho$.

We first prove two classical formulas which are remarkable by showing that the ruin probabilities can be reconstructed from the distributions of the $S_t$, or, equivalently, from the accumulated claim distribution

\[ F(x,t) = P\Bigl(\sum_{i=1}^{N_t}U_i\le x\Bigr) \]

(note that $P(S_t\le x) = F(x+t,t)$). The first formula, going back to Cramér, expresses $\psi(0,T)$ in terms of $F(\cdot,T)$, and the next one (often called Seal's formula but originating from Prabhu [711]) shows how to reduce the case $u\ne0$ to this.

Theorem 2.1 $\displaystyle 1-\psi(0,T) = \frac{1}{T}\int_0^TF(x,T)\,dx$.


Proof. For any $v\in[0,T]$, we define a new claim surplus process $\{S_t^{(v)}\}_{0\le t\le T}$ by a 'cyclic translation', meaning that we interchange the two segments of the arrival process of $\{S_t\}_{0\le t\le T}$ corresponding to the intervals $[0,v]$, resp. $[v,T]$. See Fig. V.3.

FIGURE V.3

In formulas,

\[ S_t^{(v)} = \begin{cases} S_{t+v}-S_v & 0\le t\le T-v, \\ S_T-S_v+S_{t-T+v} & T-v\le t\le T. \end{cases} \]

Define
\[ M(v,t) = \{S_t^{(v)}\le S_w^{(v)},\ 0\le w\le t\} \]
as the event that $\{S_w^{(v)}\}$ is at a minimum at time $t$. Then

\[ 1-\psi(0,T) = P(\tau(0)>T) = P(M(0,T)) = \frac{1}{T}\int_0^TP(M(v,T))\,dv = \frac{1}{T}E\int_0^TI(M(v,T))\,dv, \]

where the second equality follows from II.(5.2) with $A = (0,\infty)$, and the third from the obvious fact (exchangeability properties of the Poisson process) that $\{S_t^{(v)}\}$ has the same distribution as $\{S_t\} = \{S_t^{(0)}\}$ so that $P(M(v,T))$ does not depend on $v$.

Now consider the evaluation of $\int_0^TI(M(v,T))\,dv$. Obviously, this integral is 0 if $S_T^{(v)} = S_T>0$. If $S_T\le0$, there exist $v$ such that $M(v,T)$ occurs. For example, letting $\omega = \inf\{t>0:\ S_{t-} = \min_{0\le w\le T}S_w\}$, we can take $v\in(\omega-\epsilon,\omega)$ for some small $\epsilon$. We claim that if $M(0,T)$ occurs, then $M(v,T) = M(0,v)$. Indeed, we can write $M(v,T)$ as

\[ \{S_T\le S_{t+v}-S_v,\ 0\le t\le T-v\}\cap\{S_T\le S_T-S_v+S_{t-T+v},\ T-v\le t\le T\} \]
\[ = \{S_T\le S_t-S_v,\ v\le t\le T\}\cap\{S_v\le S_t,\ 0\le t\le v\} \]
\[ = \{S_T\le S_t-S_v,\ v\le t\le T\}\cap M(0,v) = M(0,v), \]

where the last equality follows from $S_T\le S_t$ on $M(0,T)$ and $S_v\le0$ on $M(0,v)$. It follows that if $M(0,T)$ occurs, then

\[ \int_0^TI(M(v,T))\,dv = \int_0^TI(M(0,v))\,dv = -S_T \]

(note that the Lebesgue measure of the $v$ for which $\{S_t\}$ is at a minimum at $v$ is exactly $-S_T$ on $M(0,T)$). It is then clear from the cyclical nature of the problem that this holds irrespective of whether $M(0,T)$ occurs or not as long as $S_T\le0$. Hence

\[ \frac{1}{T}E\int_0^TI(M(v,T))\,dv = \frac{1}{T}E[-S_T;\ S_T\le0] = \frac{1}{T}\int_0^TP(S_T\le-x)\,dx = \frac{1}{T}\int_0^TF(T-x,T)\,dx = \frac{1}{T}\int_0^TF(x,T)\,dx. \]

Let $f(\cdot,t)$ denote the density of $F(\cdot,t)$.

Theorem 2.2 $\displaystyle 1-\psi(u,T) = F(u+T,T)-\int_0^T\bigl(1-\psi(0,T-t)\bigr)f(u+t,t)\,dt$.

Proof. The event $\{S_T\le u\} = \bigl\{\sum_{i=1}^{N_T}U_i\le u+T\bigr\}$ can occur in two ways: either ruin does not occur in $[0,T]$, or it occurs, in which case there is a last time $\sigma$ where $S_t$ downcrosses level $u$, cf. Fig. V.4.

Here $\sigma\in[t,t+dt]$ occurs if and only if $S_t\in[u,u+dt]$ and there is no upcrossing of level $u$ after time $t$, which occurs w.p. $1-\psi(0,T-t)$. Hence

\[ P(S_T\le u) = 1-\psi(u,T)+\int_0^T\bigl(1-\psi(0,T-t)\bigr)P(S_t\in[u,u+dt]), \]

which is the same as the assertion of the theorem.

FIGURE V.4


The following representation of $\tau(0)$ will be used in the next section. The proof will be combined with the proof of Theorem IV.2.2.

Proposition 2.3 Define $\tau_-(z) = \inf\{t>0:\ S_t = -z\}$, $z>0$. Let $Z$ be a r.v. which is independent of $\{S_t\}$ and has the stationary excess distribution $B_0$. Then
\[ P(\tau(0)\in\cdot\mid\tau(0)<\infty) = P(\tau_-(Z)\in\cdot). \]

Proof of Theorem IV.2.2. For a fixed T > 0, define Sf = Sr — Sr- and let 


A(z,T) = 10: <0, VEST, Sr- =—z}, 
C(z,T) = {S,>—2,0<t<T, Sr- = —z}, 
C*(z,T) = {SE >-—2z,0<t<T, S= -2z}. 


Then 
P(r(0) € [T,T + dT], —S,)- € [z,z +dz]) = P(A(z,T))8B(z)dzdT. (2.1) 


But by sample path inspection (cf. Fig. V.5), A(z, T) = C*(z,T), and since 
{Stocter {Si }o<t<r have the same distribution, we therefore have P(A(z,T)) 
=] P(C(z; T)). Hence integrating (2.1) yields 


P(—S,(o)- € [z,z +z], T(0) < œ) = BB(z)dz iM P(C(z,T)) dT 


= BB(z)dzP(r_(z) < œ) = BB(z)dz. 
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FIGURE V.5 


Thus 
P (—S-0)- >T, S0) > y; 7(0) < oo) 


= J P(U > y+z|U > z) P(—S,@)_ € [z, z + dz], 7(0) < œ) 


3i Ut an z= pf By+za =f Bede, 


which is the assertion of Theorem IV.2.2. 


Proof of Proposition 2.3. It follows by division by 
P(S,@)- € [z,2 +z], T(0) < œ) = GBB(z) dz 
in (2.1) that 
P(r(0) € [T, T + dT] | Szo- € [z,z + dz], 7(0) < œ) = P(C(z)) aT. 


Hence 


= dT ie P(C(z))P(S,(0)— € [z, z + dz], T(0) < ov) 


= dT a P(C(z))P(Z € [z,z +z], r(0) < œœ) 


= dTP(r_(Z) € [T,T +4aT]). 
Notes and references For Theorems 2.1, 2.2, see in addition to Prabhu [711] also Seal [784, 787]. Theorem 2.1 and the present proof are in the spirit of Ballot theorems, cf. Takács [827]; a martingale proof is in Delbaen & Haezendonck [287]. For related inequalities for positive $u$, see De Vylder & Goovaerts [304].

Proposition 2.3 was noted by Asmussen & Klüppelberg [86], who instead of the present direct proof gave two arguments, one based upon a result of Asmussen & Schmidt [103] generalizing Theorem III.5.5 and one upon excursion theory for Markov processes (see X.4a).

For discrete claim size distributions, Picard & Lefèvre [701] used generalized Appell polynomials to develop recursion formulae for finite time ruin probabilities, see also Rullière & Loisel [757]. This was later extended to more general set-ups including dependent claims, cf. for instance Ignatov & Kaishev [493, 494] and Lefèvre & Loisel [575]. Continuous versions of the discrete expressions of [701] are given in De Vylder & Goovaerts [303].

In the setting of general Lévy processes, some relevant references are Shtatland [798] and Gusak & Korolyuk [442].


3 Laplace transforms 
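As usual, $-\rho_\delta$ denotes the negative solution of the equation
\[ \kappa(r) = \beta(\hat B[r]-1)-r = \delta. \tag{3.1} \]

Let $\tau_-(y)$ be defined as in Proposition 2.3. Note that $\tau_-(y)<\infty$ a.s. because of $\eta>0$.

Lemma 3.1 $Ee^{-\delta\tau_-(y)} = e^{-\rho_\delta y}$.

Proof. Optional stopping at $\tau_-(y)\wedge T$ of the martingale

\[ \bigl\{e^{-\rho_\delta S_t-t\kappa(-\rho_\delta)}\bigr\} = \bigl\{e^{-\rho_\delta S_t-\delta t}\bigr\} \]

and letting $T\to\infty$ using dominated convergence yields $1 = e^{\rho_\delta y}Ee^{-\delta\tau_-(y)}$.

Let $g_\delta(x)$ be the density of the measure $E[e^{-\delta\tau(0)};\ \tau(0)<\infty,\ \xi(0)\in dx]$ (recall that $\xi(0) = S_{\tau(0)}$).

Lemma 3.2 $\displaystyle g_\delta(x) = \beta e^{\rho_\delta x}\int_x^\infty e^{-\rho_\delta y}\,B(dy)$.

Proof. Let $Z$ be the surplus $-S_{\tau(0)-}$ just before ruin. Then by Proposition 2.3,

\[ E[e^{-\delta\tau(0)}\mid\tau(0)<\infty,\ Z = y] = Ee^{-\delta\tau_-(y)} = e^{-\rho_\delta y}. \]

Further by Theorem IV.2.2
\[ P(Z\in[y,y+dy],\ \xi(0)\in dx) = \beta B(x+dy)\,dx \]
and hence

\[ g_\delta(x) = \beta\int_0^\infty e^{-\rho_\delta y}\,B(x+dy) = \beta\int_x^\infty e^{-\rho_\delta(y-x)}\,B(dy). \]

Lemma 3.3 For the Laplace transform $\hat g_\delta[-s] = \int_0^\infty e^{-sx}g_\delta(x)\,dx$ we have

\[ \hat g_\delta[-s] = \frac{\kappa(-s)-s-\delta+\rho_\delta}{\rho_\delta-s}. \]

Proof.
\[ \hat g_\delta[-s] = \beta\int_0^\infty e^{x(-s+\rho_\delta)}\,dx\int_x^\infty e^{-\rho_\delta y}\,B(dy) = \beta\int_0^\infty e^{-\rho_\delta y}\,B(dy)\int_0^ye^{x(-s+\rho_\delta)}\,dx \]
\[ = \frac{\beta}{\rho_\delta-s}\int_0^\infty e^{-\rho_\delta y}\,B(dy)\bigl[e^{y(\rho_\delta-s)}-1\bigr] = \frac{\beta}{\rho_\delta-s}\bigl(\hat B[-s]-\hat B[-\rho_\delta]\bigr). \]

The result follows by inserting $\beta\hat B[-s] = \kappa(-s)+\beta-s$ and $\kappa(-\rho_\delta) = \delta$.

Corollary 3.4 $\displaystyle E[e^{-\delta\tau(0)};\ \tau(0)<\infty] = 1-\frac{\delta}{\rho_\delta}$.

Proof. Let $s = 0$ in Lemma 3.3.

Here is a classical result, the double Laplace transform of the ruin time $\tau(u)$:

Corollary 3.5 $\displaystyle \int_0^\infty e^{-su}E[e^{-\delta\tau(u)};\ \tau(u)<\infty]\,du = \frac{\kappa(-s)/s-\delta/\rho_\delta}{\kappa(-s)-\delta}$.

Proof. Define $Z_\delta(u) = E[e^{-\delta\tau(u)};\ \tau(u)<\infty]$. It is then easily seen that $Z_\delta(u)$ is the solution of the renewal equation $Z_\delta(u) = z_\delta(u)+\int_0^uZ_\delta(u-x)g_\delta(x)\,dx$ where $z_\delta(u) = \int_u^\infty g_\delta(x)\,dx$. Hence

\[ \int_0^\infty e^{-su}\,du\,E[e^{-\delta\tau(u)};\ \tau(u)<\infty] = \hat Z_\delta[-s] = \frac{\hat z_\delta[-s]}{1-\hat g_\delta[-s]} = \frac{\hat g_\delta[0]-\hat g_\delta[-s]}{s\bigl(1-\hat g_\delta[-s]\bigr)}. \]

Using Lemma 3.3, the result follows after simple algebra.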
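For exponential claims, Corollaries 3.4 and 3.5 can be cross-checked against the explicit Laplace transform (1.3) of Proposition 1.2. The sketch below assumes exponential claims with rate $\nu$ and premium rate 1; note that in this section $-\rho_\delta$ is the negative root of $\kappa(r) = \delta$, whereas (1.3) uses the positive root, written `R` in the code. All function names and parameter values are illustrative.

```python
import math

def kappa(r, beta, nu):
    """kappa(r) = beta*(B_hat[r] - 1) - r for Exp(nu) claims, r < nu."""
    return beta * r / (nu - r) - r

def rho_delta(delta, beta, nu):
    """rho_delta > 0 such that -rho_delta is the negative root of kappa(r) = delta."""
    b = beta - nu + delta
    return 0.5 * (b + math.sqrt(b * b + 4.0 * nu * delta))

def positive_root(delta, beta, nu):
    """Positive root R of kappa(R) = delta, as used in formula (1.3)."""
    b = beta - nu + delta
    return 0.5 * (-b + math.sqrt(b * b + 4.0 * nu * delta))

def laplace_at_zero(delta, beta, nu):
    """Corollary 3.4: E[exp(-delta*tau(0)); tau(0) < infinity]."""
    return 1.0 - delta / rho_delta(delta, beta, nu)

def double_laplace(s, delta, beta, nu):
    """Corollary 3.5 (requires kappa(-s) != delta)."""
    k = kappa(-s, beta, nu)
    return (k / s - delta / rho_delta(delta, beta, nu)) / (k - delta)

def double_laplace_from_1_3(s, delta, beta, nu):
    """Transform in u of exp(-R*u)*(1 - R/nu), i.e. of formula (1.3)."""
    R = positive_root(delta, beta, nu)
    return (1.0 - R / nu) / (s + R)

if __name__ == "__main__":
    beta, nu, delta, s = 1.0, 2.0, 0.5, 1.0
    print(laplace_at_zero(delta, beta, nu), 1.0 - positive_root(delta, beta, nu) / nu)
    print(double_laplace(s, delta, beta, nu), double_laplace_from_1_3(s, delta, beta, nu))
```

The agreement rests on the fact that both roots solve the same quadratic, so their product equals $-\nu\delta$ and hence $\delta/\rho_\delta = R/\nu$.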


Notes and references An explicit inversion of the double Laplace transform in Corollary 3.5 to obtain expressions for $\psi(u,t)$ in terms of infinite series can be found for claim size distributions of mixed Erlang type in Garcia [389] and Dickson & Willmot [322], see also Willmot & Woo [893] and Dickson [310]. For a power series expansion, see e.g. Usabel [859, 860]. An alternative very accurate numerical method is to randomize the time horizon $T$ and exploit the resulting additional smoothness of the problem, in particular in a matrix-analytic framework, cf. Section IX.8.

In Chapter XII the results of this section will be extended in various directions in 
the context of Gerber-Shiu functions. 


4 When does ruin occur? 


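For the general compound Poisson model, the known results are even less explicit than for the exponential claims case, and take basically the form of approximations and inequalities.

The first main result of the present section is that the value $um_L$, where

\[ m_L = \frac{1}{\kappa'(\gamma)} = \frac{1}{\beta\hat B'[\gamma]-1} = \frac{1}{\beta_LE_LU-1} = \frac{C}{1-\rho}, \]

is in some appropriate sense critical as the most 'likely' time of ruin (here $C$ is the Cramér-Lundberg constant). Later results then deal with more precise and refined versions of this statement.

Theorem 4.1 Assume $\eta>0$. Then given $\tau(u)<\infty$, $\tau(u)/u \stackrel{P}{\to} m_L$ as $u\to\infty$. That is, for any $\epsilon>0$

\[ P\Bigl(\Bigl|\frac{\tau(u)}{u}-m_L\Bigr|>\epsilon \,\Big|\, \tau(u)<\infty\Bigr) \to 0. \tag{4.1} \]

Further, for any $m$

\[ \frac{\psi(u,mu)}{\psi(u)} \to \begin{cases} 0 & m<m_L, \\ 1 & m>m_L. \end{cases} \tag{4.2} \]

For the proof, we need the following auxiliary result:

Proposition 4.2 Assume $\eta<0$, i.e. $\rho = \beta\mu_B>1$. Then as $u\to\infty$,

\[ \frac{\tau(u)}{u} \stackrel{\mathrm{a.s.}}{\to} m = \frac{1}{\rho-1}, \qquad \frac{E\tau(u)}{u} \to \frac{1}{\rho-1}, \tag{4.3} \]
\[ \frac{\tau(u)-mu}{\sqrt u} \stackrel{\mathcal D}{\to} N(0,\omega^2) \quad\text{where } \omega^2 = \beta\mu_B^{(2)}m^3. \tag{4.4} \]

Proof. The assumption $\eta<0$ ensures that $P(\tau(u)<\infty) = 1$ and $\tau(u) \stackrel{\mathrm{a.s.}}{\to} \infty$. By Proposition IV.1.2, $S_t/t \stackrel{\mathrm{a.s.}}{\to} 1/m$, and hence a.s.

\[ m = \lim_{t\to\infty}\frac{t}{S_t} = \lim_{u\to\infty}\frac{\tau(u)}{S_{\tau(u)}} = \lim_{u\to\infty}\frac{\tau(u)}{u+\xi(u)} = \lim_{u\to\infty}\frac{\tau(u)}{u}, \]

using $\xi(u) = o(u)$ a.s., cf. Proposition A1.6. This proves the first assertion of (4.3). For the second, note that by Wald's identity

\[ u+E\xi(u) = ES_{\tau(u)} = E\tau(u)\cdot ES_1 = (\rho-1)E\tau(u) \]

and that $E\xi(u)/u\to0$, cf. again Proposition A1.6.

For (4.4), note first that (Proposition IV.1.5)

\[ \frac{S_t-t/m}{\sqrt t} \stackrel{\mathcal D}{\to} N\bigl(0,\beta\mu_B^{(2)}\bigr). \]

According to Anscombe's theorem (e.g. Theorem 7.3.2 of [246]) and (4.3), the same conclusion holds with $t$ replaced by $\tau(u)$. If $Z\sim N(0,1)$, this can be rewritten as

\[ u+\xi(u)-\tau(u)/m \stackrel{\mathcal D}{\approx} \sqrt{\beta\mu_B^{(2)}\tau(u)}\,Z, \]

implying (using that $-Z$ has the same distribution as $Z$, and $\tau(u)\approx mu$)

\[ \tau(u) \stackrel{\mathcal D}{\approx} mu+m\sqrt{\beta\mu_B^{(2)}\tau(u)}\,Z \stackrel{\mathcal D}{\approx} mu+m^{3/2}\sqrt{\beta\mu_B^{(2)}u}\,Z, \]
\[ \frac{\tau(u)-mu}{\sqrt u} \stackrel{\mathcal D}{\approx} m^{3/2}\sqrt{\beta\mu_B^{(2)}}\,Z = \omega Z. \]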


Proof of Theorem 4.1. The l.h.s. of (4.1) is

\[ \frac{P\bigl(|\tau(u)/u-m_L|>\epsilon,\ \tau(u)<\infty\bigr)}{P(\tau(u)<\infty)} = \frac{e^{-\gamma u}E_L\bigl[e^{-\gamma\xi(u)};\ |\tau(u)/u-m_L|>\epsilon\bigr]}{\psi(u)} \le \frac{e^{-\gamma u}}{\psi(u)}\,P_L\bigl(|\tau(u)/u-m_L|>\epsilon\bigr). \]

Here $e^{-\gamma u}/\psi(u)$ stays bounded by the Cramér-Lundberg approximation. By Proposition 4.2 (applied under $P_L$, where the safety loading is negative), $P_L(\cdot)\to0$, proving (4.1), and (4.2) follows immediately from (4.1).

Notes and references Theorem 4.1 is standard, though it is not easy to attribute priority to any particular author. Thus, the result comes out not only by the present direct proof but also from any of the results in the following subsections.

For a study of the distribution of the number of claims until ruin, see Egidio dos Reis [340].


4a Segerdahl’s normal approximation 


We shall now prove a classical result due to Segerdahl, which may be viewed
both as a refinement of Theorem 4.1 (by considering $\psi(u,T)$ for $T$ which are
close to the critical value $um_L$), and as a time-dependent version of the
Cramér-Lundberg approximation.


Corollary 4.3 (SEGERDAHL [791]) Let $C$ be the Cramér-Lundberg constant
and define $\omega_L^2 = \beta_L E_L U^2 m_L^3 = \beta\hat B''[\gamma] m_L^3$, where
$m_L = 1/(\rho_L - 1) = 1/(\beta\hat B'[\gamma] - 1)$. Then for any $y$,

\[ e^{\gamma u} \psi\bigl(u, um_L + y\omega_L\sqrt{u}\bigr) \to C\Phi(y). \tag{4.5} \]
For the proof, we need the following auxiliary result: 


Proposition 4.4 (STAM’s LEMMA) If $\eta < 0$, then $\xi(u)$ and $\tau(u)$ are
asymptotically independent in the sense that, letting $Z$ be a $N(0,\omega^2)$ r.v. with
$\omega^2$ as in (4.4), one has

\[ E f(\xi(u))\, g\Bigl( \frac{\tau(u) - mu}{\sqrt{u}} \Bigr) \to E f(\xi(\infty)) \cdot E g(Z) \tag{4.6} \]

whenever $f$, $g$ are continuous and bounded on $[0,\infty)$, resp. $(-\infty,\infty)$.

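Proof. Define $u' = u - u^{1/4}$. Then the distribution of $\tau(u) - \tau(u')$ given $\mathcal F_{\tau(u')}$
is readily seen to be degenerate at zero if $S_{\tau(u')} > u$ and otherwise that of $\tau(v)$
with $v = u - S_{\tau(u')} = u^{1/4} - \xi(u')$. Using (4.3), we get

\[ E[\tau(u) - \tau(u')] = E\bigl[ \tau(u^{1/4} - \xi(u'));\ \xi(u') \le u^{1/4} \bigr] \le E\tau(u^{1/4}) = O(u^{1/4}), \]

and thus in (4.6), we can replace $\tau(u)$ by $\tau(u')$. Let $h(u) = E f(\xi(u))$. Then
$h(u) \to h(\infty) = E f(\xi(\infty))$, and similarly as above we get

\[ E\bigl[ f(\xi(u)) \,\big|\, \mathcal F_{\tau(u')} \bigr]
 = h\bigl(u^{1/4} - \xi(u')\bigr) I\bigl(\xi(u') \le u^{1/4}\bigr) + f\bigl(\xi(u') - u^{1/4}\bigr) I\bigl(\xi(u') > u^{1/4}\bigr)
 \xrightarrow{P} h(\infty) + 0, \]

using that $u^{1/4} - \xi(u') \to \infty$ w.r.t. $P$ because of $\xi(u') \xrightarrow{D} \xi(\infty)$ (recall that
$\eta < 0$). Hence

\[ E f(\xi(u))\, g\Bigl( \frac{\tau(u') - mu}{\sqrt{u}} \Bigr)
 = E\Bigl[ E\bigl[ f(\xi(u)) \,\big|\, \mathcal F_{\tau(u')} \bigr]\, g\Bigl( \frac{\tau(u') - mu}{\sqrt{u}} \Bigr) \Bigr]
 \to E f(\xi(\infty)) \cdot E g(Z). \]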

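Proof of Corollary 4.3.

\[ e^{\gamma u}\psi\bigl(u, um_L + y\omega_L\sqrt{u}\bigr) = e^{\gamma u} P\bigl(\tau(u) \le um_L + y\omega_L\sqrt{u}\bigr) \]
\[ = E_L\bigl[ e^{-\gamma\xi(u)};\ \tau(u) \le um_L + y\omega_L\sqrt{u} \bigr] \]
\[ \sim E_L e^{-\gamma\xi(\infty)} \cdot P_L\bigl(\tau(u) \le um_L + y\omega_L\sqrt{u}\bigr) \]
\[ \to C\Phi(y), \]

where we used Stam’s lemma in the third step and (4.4) in the last.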


For practical purposes, Segerdahl’s result suggests the approximation

\[ \psi(u,T) \approx C e^{-\gamma u}\, \Phi\Bigl( \frac{T - um_L}{\omega_L\sqrt{u}} \Bigr). \tag{4.7} \]

To arrive at this, just substitute $T = um_L + y\omega_L\sqrt{u}$ in (4.5) and solve for
$y = y(T)$. The precise condition for (4.7) to be valid is that $T$ varies with $u$ in
such a way that $y(T)$ has a limit in $(-\infty,\infty)$ as $u \to \infty$. Thus, in practice one
would trust (4.7) whenever $u$ is large and $|y(T)|$ moderate or small (numerical
evidence presented in [55] indicates, however, that for the fit of (4.7) to be good,
$u$ needs to be very large).
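As a minimal numerical sketch of (4.7), consider exponential claims with rate $\delta$ and premium rate 1. This special case is an assumption made only for the illustration; it gives the closed forms $\gamma = \delta - \beta$, $C = \rho = \beta/\delta$, $\hat B'[\gamma] = \delta/\beta^2$ and $\hat B''[\gamma] = 2\delta/\beta^3$. The function names are hypothetical.

```python
import math


def std_normal_cdf(x: float) -> float:
    """Standard normal c.d.f. Phi(x), expressed through the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def segerdahl_exponential(u: float, T: float, beta: float, delta: float) -> float:
    """Segerdahl's approximation (4.7) for exponential(delta) claims (assumed)."""
    if not 0.0 < beta < delta:
        raise ValueError("need 0 < beta < delta, i.e. rho < 1")
    gamma = delta - beta                     # adjustment coefficient
    C = beta / delta                         # Cramer-Lundberg constant (= rho here)
    b1 = delta / beta**2                     # B'[gamma]
    b2 = 2.0 * delta / beta**3               # B''[gamma]
    m_L = 1.0 / (beta * b1 - 1.0)            # m_L = 1/(beta*B'[gamma] - 1)
    omega_L = math.sqrt(beta * b2 * m_L**3)  # omega_L^2 = beta*B''[gamma]*m_L^3
    y = (T - u * m_L) / (omega_L * math.sqrt(u))
    return C * math.exp(-gamma * u) * std_normal_cdf(y)
```

At the critical horizon $T = um_L$ the normal factor equals $1/2$, so the sketch returns half of the Cramér-Lundberg value $Ce^{-\gamma u}$.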


A remarkably sharp and explicit asymptotic result in terms of the time hori- 
zon T is the following: 


Theorem 4.5 For every fixed $u > 0$, we have, as $T \to \infty$,

pu) — plu, T) ~ Cre me lKOm IP p-3/2 (14 H(u)), (4.8) 
where C = (278B" ml) ey? and H(u) is a renewal function which satisfies 
H(u) ~ 2(BB" Ym] "u as u — œ. 


The proof is quite involved and uses deep results from random walk theory; we
refer to Teugels [841].

Notes and references Corollary 4.3 is due to Segerdahl [791]. The present proof is
basically that of Siegmund [806]; see also von Bahr [121] and Gut [443]. For refinements 
of Corollary 4.3 in terms of Edgeworth expansions, see Asmussen [55] and Malinovskii 
[625]. Cf. also Höglund [478]. 


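4b Gerber’s time-dependent version of Lundberg’s inequality

For $y > 0$, define $\alpha_y$, $\gamma_y$ by

\[ \kappa'(\alpha_y) = \frac{1}{y}, \qquad \gamma_y = \alpha_y - y\kappa(\alpha_y). \tag{4.9} \]

Note that $\alpha_y > \gamma_m$ and that $\gamma_y > \gamma$ (unless for the critical value
$y = 1/\kappa'(\gamma) = m_L$), cf. Fig. V.1.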


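Theorem 4.6

\[ \psi(u, yu) \le e^{-\gamma_y u}, \qquad y < \frac{1}{\kappa'(\gamma)}, \tag{4.10} \]

\[ \psi(u) - \psi(u, yu) \le e^{-\gamma_y u}, \qquad y > \frac{1}{\kappa'(\gamma)}. \tag{4.11} \]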


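Proof. Consider first the case $y < 1/\kappa'(\gamma)$. Then $\kappa(\alpha_y) > 0$ (see Fig. V.1), and
hence

\[ \psi(u, yu) = e^{-\alpha_y u} E_{\alpha_y}\bigl[ e^{-\alpha_y\xi(u) + \tau(u)\kappa(\alpha_y)};\ \tau(u) \le yu \bigr] \]
\[ \le e^{-\alpha_y u} E_{\alpha_y}\bigl[ e^{\tau(u)\kappa(\alpha_y)};\ \tau(u) \le yu \bigr] \le e^{-\alpha_y u + yu\kappa(\alpha_y)}. \]

Similarly, if $y > 1/\kappa'(\gamma)$, we have $\kappa(\alpha_y) < 0$ and get

\[ \psi(u) - \psi(u, yu) = e^{-\alpha_y u} E_{\alpha_y}\bigl[ e^{-\alpha_y\xi(u) + \tau(u)\kappa(\alpha_y)};\ yu < \tau(u) < \infty \bigr] \]
\[ \le e^{-\alpha_y u} E_{\alpha_y}\bigl[ e^{\tau(u)\kappa(\alpha_y)};\ yu < \tau(u) < \infty \bigr] \]
\[ \le e^{-\alpha_y u + yu\kappa(\alpha_y)}. \]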


Remark 4.7 It may appear that the proof uses considerably less information
on $\alpha_y$ than is inherent in the definition (4.9). However, the point is that we want
to select an $\alpha$ which produces the largest possible exponent in the inequalities.
From the proof it is seen that this amounts to that $\alpha$ should maximize $\alpha - y\kappa(\alpha)$.
Differentiating w.r.t. $\alpha$, we arrive at the expression in (4.9).
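The maximization in Remark 4.7 can be checked numerically. The following is a minimal sketch under the assumption of exponential claims with rate $\delta$ and premium rate 1, for which $\kappa(\alpha) = \beta(\delta/(\delta-\alpha) - 1) - \alpha$ and $\kappa'(\alpha) = 1/y$ can be solved in closed form. The function names are hypothetical.

```python
import math


def kappa(alpha: float, beta: float, delta: float) -> float:
    """kappa(alpha) = beta*(B[alpha] - 1) - alpha for exponential(delta) claims."""
    return beta * (delta / (delta - alpha) - 1.0) - alpha


def alpha_y(y: float, beta: float, delta: float) -> float:
    """Root of kappa'(alpha) = 1/y, i.e. the maximizer of alpha - y*kappa(alpha)."""
    return delta - math.sqrt(beta * delta / (1.0 + 1.0 / y))


def gamma_y(y: float, beta: float, delta: float) -> float:
    """Time-dependent Lundberg exponent gamma_y = alpha_y - y*kappa(alpha_y)."""
    a = alpha_y(y, beta, delta)
    return a - y * kappa(a, beta, delta)


def gerber_bound(u: float, y: float, beta: float, delta: float) -> float:
    """Right-hand side exp(-gamma_y*u) of the inequalities (4.10) and (4.11)."""
    return math.exp(-gamma_y(y, beta, delta) * u)
```

In this special case $\gamma = \delta - \beta$, and the sketch reproduces $\gamma_y = \gamma$ exactly at the critical value $y = 1/\kappa'(\gamma) = \rho/(1-\rho)$, with $\gamma_y > \gamma$ on either side.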




In view of Theorem 4.6, $\gamma_y$ is sometimes called the time-dependent Lundberg
exponent.

An easy combination with the proof of Theorem IV.6.3 yields the following
sharpening of (4.10):


Proposition 4.8 $\psi(u, yu) \le C_+(\alpha_y) e^{-\gamma_y u}$ where

\[ C_+(\alpha_y) = \sup_{x \ge 0} \frac{\bar B(x)}{\int_x^\infty e^{\alpha_y(z - x)}\, B(dz)}. \]

Notes and references Theorem 4.6 is due to Gerber [397], who used a martingale
argument. For a different proof, see Martin-Löf [631]. Numerical comparisons are in
Grandell [430]; the bound $e^{-\gamma_y u}$ turns out to be rather crude, which may be understood
from Theorem 4.9 below, which shows that the correct rate of decay of $\psi(u, yu)$ is
$e^{-\gamma_y u}/\sqrt{u}$.

Some further discussion is given in XVI.2, and generalizations to more general 
models are given in Chapter VII. Höglund [477] treats the renewal case. 


4c Arfwedson’s saddlepoint approximation 


Our next objective is to strengthen the time-dependent Lundberg inequalities
to approximations. As a motivation, it is instructive to reinspect the choice
of the change of measure in the proof, i.e. the choice of $\alpha_y$. For any $\alpha > \gamma_m$,
Proposition 4.2 yields

\[ E_\alpha \tau(u) \sim \frac{u}{\kappa'(\alpha)}. \]

I.e., if we want $E_\alpha\tau(u) \sim T$, then the relevant choice is precisely $\alpha = \alpha_y$ where
$y = T/u$. We thereby obtain that $T$ is ‘in the center’ of the $P_\alpha$-distribution of
$\tau(u)$. This idea is precisely what characterizes the saddlepoint method.

The traditional application of the saddlepoint method is to derive approxi- 
mations, not inequalities, and in case of ruin probabilities the approach leads to 
the following result: 


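Theorem 4.9 If $y < 1/\kappa'(\gamma)$, then the solution $\tilde\alpha_y < \alpha_y$ of $\kappa(\tilde\alpha_y) = \kappa(\alpha_y)$ is
$< 0$, and

\[ \psi(u, yu) \sim \frac{\alpha_y - \tilde\alpha_y}{\alpha_y |\tilde\alpha_y| \sqrt{2\pi y\beta\hat B''[\alpha_y]}} \cdot \frac{e^{-\gamma_y u}}{\sqrt{u}}, \quad u \to \infty. \tag{4.12} \]

If $y > 1/\kappa'(\gamma)$, then $\tilde\alpha_y > 0$, and

\[ \psi(u) - \psi(u, yu) \sim \frac{\alpha_y - \tilde\alpha_y}{\alpha_y \tilde\alpha_y \sqrt{2\pi y\beta\hat B''[\alpha_y]}} \cdot \frac{e^{-\gamma_y u}}{\sqrt{u}}, \quad u \to \infty. \tag{4.13} \]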


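Proof. In view of Stam’s lemma, the formula

\[ \psi(u, yu) = e^{-\alpha_y u} E_{\alpha_y}\bigl[ e^{-\alpha_y\xi(u) + \tau(u)\kappa(\alpha_y)};\ \tau(u) \le yu \bigr] \]

suggests heuristically that

\[ \psi(u, yu) \approx e^{-\alpha_y u}\, E_{\alpha_y} e^{-\alpha_y\xi(\infty)} \cdot E_{\alpha_y}\bigl[ e^{\tau(u)\kappa(\alpha_y)};\ \tau(u) \le yu \bigr]. \tag{4.14} \]

Here the first expectation can be estimated similarly as in the proof of the
Cramér-Lundberg’s approximation in Chapter IV. Using Lemma IV.5.7 with $P$
replaced by $P_{\tilde\alpha_y}$ and $P_L$ by $P_{\alpha_y}$, we have $\gamma_{\tilde\alpha_y} = \alpha_y - \tilde\alpha_y$ and get

\[ E_{\alpha_y} e^{-\alpha_y\xi(\infty)}
 = \frac{\gamma_{\tilde\alpha_y}}{\alpha_y \kappa_{\tilde\alpha_y}'(\gamma_{\tilde\alpha_y})}
   \Bigl( 1 - \frac{\beta_{\tilde\alpha_y}\bigl(\hat B_{\tilde\alpha_y}[\gamma_{\tilde\alpha_y} - \alpha_y] - 1\bigr)}{\gamma_{\tilde\alpha_y} - \alpha_y} \Bigr) \]
\[ = \frac{\alpha_y - \tilde\alpha_y}{\alpha_y \kappa'(\alpha_y)} \Bigl( 1 + \frac{\beta\bigl(1 - \hat B[\tilde\alpha_y]\bigr)}{\tilde\alpha_y} \Bigr)
 = \frac{y(\alpha_y - \tilde\alpha_y)}{\alpha_y} \cdot \frac{\tilde\alpha_y + \beta\bigl(1 - \hat B[\tilde\alpha_y]\bigr)}{\tilde\alpha_y} \]
\[ = \frac{-y(\alpha_y - \tilde\alpha_y)\kappa(\tilde\alpha_y)}{\alpha_y\tilde\alpha_y}
 = \frac{y(\alpha_y - \tilde\alpha_y)\kappa(\alpha_y)}{\alpha_y|\tilde\alpha_y|}. \]

For the second term in (4.14), it seems tempting to apply the normal approximation
(4.4). Writing $\tau(u) \approx yu + u^{1/2}\omega V$, where $V$ is normal(0,1) under $P_{\alpha_y}$
and

\[ \omega^2 = \frac{\beta_{\alpha_y} E_{\alpha_y}U^2}{(\rho_{\alpha_y} - 1)^3} = \frac{\beta\hat B''[\alpha_y]}{(\rho_{\alpha_y} - 1)^3} = y^3\beta\hat B''[\alpha_y], \]

we get heuristically that

\[ E_{\alpha_y}\bigl[ e^{\tau(u)\kappa(\alpha_y)};\ \tau(u) \le yu \bigr]
 \approx e^{yu\kappa(\alpha_y)} E\bigl[ e^{\kappa(\alpha_y)u^{1/2}\omega V};\ V \le 0 \bigr] \]
\[ = e^{yu\kappa(\alpha_y)} \int_0^\infty e^{-\kappa(\alpha_y)u^{1/2}\omega x}\varphi(x)\,dx
 = e^{yu\kappa(\alpha_y)} \frac{1}{\kappa(\alpha_y)u^{1/2}\omega} \int_0^\infty e^{-z}\varphi\bigl(z/(\kappa(\alpha_y)u^{1/2}\omega)\bigr)\,dz \]
\[ \approx e^{yu\kappa(\alpha_y)} \frac{1}{\kappa(\alpha_y)u^{1/2}\omega} \int_0^\infty e^{-z}\frac{1}{\sqrt{2\pi}}\,dz
 = e^{yu\kappa(\alpha_y)} \frac{1}{\kappa(\alpha_y)\sqrt{2\pi u\omega^2}}. \]

Inserting these estimates in (4.14), (4.12) follows. The proof of (4.13) is
completely similar.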


The difficulties in making the proof precise are in part to show (4.14) rigorously,
and in part that for the final calculation one needs a sharpened version
of the CLT for $\tau(u)$ (basically a local CLT with remainder term).


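Example 4.10 Assume that $\bar B(x) = e^{-\delta x}$. Then $\kappa(\alpha) = \beta(\delta/(\delta - \alpha) - 1) - \alpha$,
$\kappa'(\alpha) = \beta\delta/(\delta - \alpha)^2 - 1$, and the equation $\kappa'(\alpha) = 1/y$ is easily seen to have
solution

\[ \alpha_y = \delta - \sqrt{\frac{\beta\delta}{1 + 1/y}} \]

(the sign of the square root is negative because the c.g.f. is undefined for $\alpha > \delta$).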
It follows that 


= = pv = = bv 
Vay ZSV Qy = 1+1/y’ Ba, =B+ay=B+u— 1+1/y’ 
ee |) Bv Pe Bv 
= = Pa Oy, 2 ’ a Zø, 
Qy — Ay = Ba, — Va, 5B HV EST y i+ 1/y B 
2 2y!/2(1 3/2 
Pej- Z — — Py) 
and (4.12) gives the approximation


wi (ose— ari) 


(v ret) [E ayya “A 


for $\psi(u, yu)$ when $y < 1/\kappa'(\gamma) = \rho/(1 - \rho)$.


e yy 
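The right-hand side of (4.12) is easy to evaluate numerically in the exponential case. The sketch below assumes exponential claims with rate $\delta$ and premium rate 1, and uses one observation that is ours rather than taken from the text: writing $s = \delta - \alpha_y$, the two roots of $\kappa(\cdot) = \kappa(\alpha_y)$ satisfy $(\delta - \alpha_y)(\delta - \tilde\alpha_y) = \beta\delta$, so that $\tilde\alpha_y = \delta - \beta\delta/s$. The function name is hypothetical.

```python
import math


def arfwedson_exponential(u: float, y: float, beta: float, delta: float) -> float:
    """Right-hand side of (4.12) for exponential(delta) claims (assumed).

    Valid only for 0 < y < rho/(1 - rho), the range of (4.12).
    """
    rho = beta / delta
    if not (0.0 < rho < 1.0 and 0.0 < y < rho / (1.0 - rho)):
        raise ValueError("need rho < 1 and 0 < y < rho/(1 - rho)")
    s = math.sqrt(beta * delta / (1.0 + 1.0 / y))  # s = delta - alpha_y
    alpha = delta - s                              # alpha_y
    alpha_tilde = delta - beta * delta / s         # other root, negative here
    kappa_alpha = beta * delta / s - beta - alpha  # kappa(alpha_y)
    gamma_y = alpha - y * kappa_alpha              # time-dependent exponent
    b2 = 2.0 * delta / s**3                        # B''[alpha_y]
    const = (alpha - alpha_tilde) / (
        alpha * abs(alpha_tilde) * math.sqrt(2.0 * math.pi * y * beta * b2)
    )
    return const * math.exp(-gamma_y * u) / math.sqrt(u)
```

For instance, `arfwedson_exponential(20.0, 1.0, 0.7, 1.0)` evaluates the approximation to $\psi(20, 20)$ when $\beta = 0.7$, $\delta = 1$.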


Notes and references Theorem 4.9 is from Arfwedson [51]. A related result 
appears in Barndorff-Nielsen & Schmidli [138]. 


5 Diffusion approximations 


The idea behind the diffusion approximation is to first approximate the claim 
surplus process by a Brownian motion with drift by matching the two first 
moments, and next to note that such an approximation in particular implies 
that the first passage probabilities are close. 

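The mathematical result behind is Donsker’s theorem for a simple random
walk $\{S_n^{(d)}\}_{n=0,1,\ldots}$ in discrete time: if $\mu = ES_1^{(d)}$ is the drift and $\sigma^2 = \mathrm{Var}(S_1^{(d)})$
the variance, then

\[ \Bigl\{ \frac{1}{\sigma\sqrt{c}}\bigl( S^{(d)}_{\lfloor tc \rfloor} - tc\mu \bigr) \Bigr\}_{t \ge 0} \xrightarrow{D} \{W_0(t)\}_{t \ge 0}, \quad c \to \infty, \tag{5.1} \]

where $\{W_\zeta(t)\}$ is Brownian motion with drift $\zeta$ and variance (diffusion constant)
1 (here $\xrightarrow{D}$ refers to weak convergence in $D = D[0,\infty)$).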

It is fairly straightforward to translate Donsker’s theorem into a parallel
statement for continuous time random walks (Lévy processes), of which a particular
case is the claim surplus process (see the proof of Theorem 5.1 below). However,
for the purpose of approximating ruin probabilities the centering around
the mean (the $tc\mu$ term in (5.1)) is inconvenient. We want an approximation
of the claim surplus process itself, and this can be obtained under the assumption
that the safety loading $\eta$ is small and positive. This is the regime of the
diffusion approximation (note that this is just the same as for the heavy traffic
approximation for infinite horizon ruin probabilities studied in IV.7c).

Mathematically, we shall represent this assumption on $\eta$ by a family $\{S_t^{(p)}\}_{t \ge 0}$
of claim surplus processes indexed by the premium rate $p$, such that the claim
size distribution $B$ and the Poisson rate $\beta$ are the same for all $p$ (i.e.,
$S_t^{(p)} = \sum_{i=1}^{N_t} U_i - tp$), and consider the limit $p \downarrow \rho$, where $\rho$ is the critical premium rate
$\beta\mu_B$.

Theorem 5.1 As $p \downarrow \rho$, we have

\[ \Bigl\{ \frac{|\mu|}{\sigma^2} S^{(p)}_{t\sigma^2/\mu^2} \Bigr\}_{t \ge 0} \xrightarrow{D} \{W_{-1}(t)\}_{t \ge 0}, \tag{5.2} \]

where $\mu = \mu_p = \rho - p$, $\sigma^2 = \beta\mu_B^{(2)}$.


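Proof. The first step is to note that

\[ \Bigl\{ \frac{1}{\sigma\sqrt{c}}\bigl( S^{(p)}_{tc} - tc\mu_p \bigr) \Bigr\}_{t \ge 0}
 = \Bigl\{ \frac{1}{\sigma\sqrt{c}}\Bigl( \sum_{i=1}^{N_{tc}} U_i - tc\rho \Bigr) \Bigr\}_{t \ge 0}
 \xrightarrow{D} \{W_0(t)\}_{t \ge 0} \tag{5.3} \]

whenever $c = c_p \uparrow \infty$ as $p \downarrow \rho$. Indeed, this is an easy consequence of (5.1) with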
S* = S® and the inequalities 


SY ple < SP < SP yj +pple, nje<t < (n+1)/e, 


W 


cf. Lemma IV.1.3. 
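Letting $c = \sigma^2/\mu^2$, (5.3) takes the form

\[ \Bigl\{ \frac{|\mu|}{\sigma^2} S^{(p)}_{t\sigma^2/\mu^2} + t \Bigr\}_{t \ge 0} \xrightarrow{D} \{W_0(t)\}_{t \ge 0}, \qquad
   \Bigl\{ \frac{|\mu|}{\sigma^2} S^{(p)}_{t\sigma^2/\mu^2} \Bigr\}_{t \ge 0} \xrightarrow{D} \{W_0(t) - t\}_{t \ge 0} = \{W_{-1}(t)\}_{t \ge 0}. \]

Now let

\[ \tau_p(u) = \inf\{t \ge 0 : S_t^{(p)} > u\}, \qquad \tau_\zeta(u) = \inf\{t \ge 0 : W_\zeta(t) > u\}. \]

It is well-known (Corollary III.1.6 or [APQ, p. 263]) that the distribution $IG(\cdot\,; \zeta; u)$
of $\tau_\zeta(u)$ (often referred to as the inverse Gaussian distribution) is given by

\[ IG(x; \zeta; u) = P(\tau_\zeta(u) \le x) = 1 - \Phi\Bigl( \frac{u}{\sqrt{x}} - \zeta\sqrt{x} \Bigr) + e^{2\zeta u}\,\Phi\Bigl( -\frac{u}{\sqrt{x}} - \zeta\sqrt{x} \Bigr). \tag{5.4} \]

Note that $IG(\cdot\,; \zeta; u)$ is defective when $\zeta < 0$.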


Corollary 5.2 As $p \downarrow \rho$,

\[ \psi_p\Bigl( \frac{u\sigma^2}{|\mu|}, \frac{T\sigma^2}{\mu^2} \Bigr) \to IG(T; -1; u). \]



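Proof. Since $f \to \sup_{0 \le t \le T} f(t)$ is continuous on $D$ a.e. w.r.t. any probability
measure concentrated on the continuous functions, the continuous mapping
theorem yields

\[ \sup_{0 \le t \le T} \frac{|\mu|}{\sigma^2} S^{(p)}_{t\sigma^2/\mu^2} \xrightarrow{D} \sup_{0 \le t \le T} W_{-1}(t). \]

Since the r.h.s. has a continuous distribution, this implies

\[ P\Bigl( \sup_{0 \le t \le T} \frac{|\mu|}{\sigma^2} S^{(p)}_{t\sigma^2/\mu^2} > u \Bigr) \to P\Bigl( \sup_{0 \le t \le T} W_{-1}(t) > u \Bigr). \]

But the l.h.s. is $\psi_p(u\sigma^2/|\mu|, T\sigma^2/\mu^2)$, and the r.h.s. is $IG(T; -1; u)$.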


For practical purposes, Corollary 5.2 suggests the approximation

\[ \psi(u, T) \approx IG\bigl( T\mu^2/\sigma^2;\ -1;\ u|\mu|/\sigma^2 \bigr). \tag{5.5} \]

Note that letting $T \to \infty$ in (5.5), we obtain formally the approximation

\[ \psi(u) \approx IG\bigl( \infty;\ -1;\ u|\mu|/\sigma^2 \bigr) = e^{-2u|\mu|/\sigma^2}. \tag{5.6} \]

This is the same as the heavy-traffic approximation derived in IV.7c. However,
since $\psi(u)$ has infinite horizon, the continuity argument above does not generalize
immediately, and in fact some additional arguments are needed to justify
(5.6) from Theorem 5.1. Because of the direct argument in Chapter IV, we omit
the details; see Grandell [426], [427] or [APQ, pp. 196, 199].
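The approximations (5.5) and (5.6) only require the inverse Gaussian c.d.f. (5.4) and the first two claim moments. A minimal sketch follows; the function and parameter names are hypothetical, and the premium rate `p` defaults to 1 as an assumption.

```python
import math


def std_normal_cdf(x: float) -> float:
    """Standard normal c.d.f. Phi(x), expressed through the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def inverse_gaussian_cdf(x: float, zeta: float, u: float) -> float:
    """IG(x; zeta; u) of (5.4): first passage of drift-zeta Brownian motion to u > 0."""
    if x <= 0.0:
        return 0.0
    if math.isinf(x):
        # total mass: 1 for zeta >= 0, defective value exp(2*zeta*u) for zeta < 0
        return 1.0 if zeta >= 0.0 else math.exp(2.0 * zeta * u)
    r = math.sqrt(x)
    return (1.0 - std_normal_cdf(u / r - zeta * r)
            + math.exp(2.0 * zeta * u) * std_normal_cdf(-u / r - zeta * r))


def diffusion_approx(u: float, T: float, beta: float,
                     mu_B: float, mu_B2: float, p: float = 1.0) -> float:
    """Diffusion approximation (5.5); mu_B, mu_B2 are the first two claim moments."""
    mu = beta * mu_B - p       # drift of the claim surplus process, must be < 0
    sigma2 = beta * mu_B2      # variance constant sigma^2 = beta * mu_B^(2)
    if mu >= 0.0:
        raise ValueError("need positive safety loading (beta*mu_B < p)")
    return inverse_gaussian_cdf(T * mu * mu / sigma2, -1.0, u * abs(mu) / sigma2)
```

Passing `T = math.inf` returns the infinite horizon value (5.6), which gives a quick consistency check between the two formulas.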

Checks of the numerical fits of (5.5) and (5.6) are presented, e.g., in Asmussen 
[55]. The picture which emerges is that the approximations are not terribly 
precise, in particular for large u. In view of the excellent fit of the Cramér- 
Lundberg approximation, (5.6) therefore does not appear to be of much practical 
relevance for the compound Poisson model. However, for more general models 
it may be easier to generalize the diffusion approximation than the Cramér- 
Lundberg approximation; as an example of such a generalization we mention 
the paper [342] by Emanuel et al. on the premium rule involving interest. In 
contrast, the simplicity of (5.5) combined with the fact that finite horizon ruin 
probabilities are so hard to deal with even for the compound Poisson model 
makes this approximation more appealing. However, in the next subsection we 
shall derive a refinement of (5.5) for the compound Poisson model which does 
not require much more computation, and which is much more precise. 

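We conclude this section by giving a more general triangular array version
of Theorem 5.1. The proof is a straightforward combination of the proof of
Theorem 5.1 and Section VIII.6 of [APQ].

Theorem 5.3 Consider a family $\{S_t^{(\theta)}\}_{t \ge 0}$ of claim surplus processes indexed by
a parameter $\theta$, such that the Poisson rate $\beta_\theta$, the claim size distribution $B_\theta$ and
the premium rate $p_\theta$ depend on $\theta$. Assume further that $\beta_\theta\mu_{B_\theta} < p_\theta$, that

\[ \beta_\theta \to \beta_{\theta_0}, \quad B_\theta \xrightarrow{D} B_{\theta_0}, \quad p_\theta \to p_{\theta_0}, \quad p_\theta - \beta_\theta\mu_{B_\theta} \to 0, \]

as $\theta \to \theta_0$ and that the $U^2$ are uniformly integrable w.r.t. the $B_\theta$. Then as
$\theta \to \theta_0$, we have

\[ \Bigl\{ \frac{|\mu|}{\sigma^2} S^{(\theta)}_{t\sigma^2/\mu^2} \Bigr\}_{t \ge 0} \xrightarrow{D} \{W_{-1}(t)\}_{t \ge 0}, \tag{5.7} \]

where $\mu = \mu_\theta = \rho_\theta - p_\theta = \beta_\theta\mu_{B_\theta} - p_\theta$, $\sigma^2 = \sigma_\theta^2 = \beta_\theta\mu_{B_\theta}^{(2)}$.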


Notes and references Diffusion approximations of random walks via Donsker’s 
theorem is a classical topic of probability theory. See for example Billingsley [167]. The 
first application in risk theory is Iglehart [492], and two further standard references in 
the area are Grandell [426], [427]. All material of this section can be found in these 
references. 

For claims with infinite variance, Furrer, Michna & Weron [383] suggested an ap- 
proximation by a stable Lévy process rather than a Brownian motion. Further relevant 
references in this direction are Furrer [382], Boxma & Cohen [194] and Whitt [883]. 


6 Corrected diffusion approximations 


The idea behind the simple diffusion approximation is to replace the risk process
by a Brownian motion (by fitting the two first moments) and use the Brownian
first passage probabilities as approximation for the ruin probabilities. Since
Brownian motion is skip-free, this idea ignores (among other things) the presence
of the overshoot $\xi(u)$, which we have seen to play an important role for example
for the Cramér-Lundberg approximation. The objective of the corrected
diffusion approximation is to take this and other deficits into consideration.

The set-up is the exponential family of compound risk processes with parameters
$\beta_\theta$, $B_\theta$ constructed in IV.4. However, whereas there we let the given
risk process with safety loading $\eta > 0$ correspond to $\theta = 0$, it is more convenient
here to use some value $\theta_0 < 0$ and let $\theta = 0$ correspond to $\eta = 0$ (zero drift);
this is because in the regime of the diffusion approximation, $\eta$ is close to zero,
and we want to consider the limit $\eta \downarrow 0$ corresponding to $\theta_0 \uparrow 0$.

In terms of the given risk process with Poisson intensity $\beta$, claim size distribution
$B$, $\kappa(\alpha) = \beta(\hat B[\alpha] - 1) - \alpha$ and $\rho = \beta\mu_B < 1$, $\eta = 1/\rho - 1 > 0$, this
means the following:


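1. Determine $\gamma_m > 0$ by $\kappa'(\gamma_m) = 0$ and let $\theta_0 = -\gamma_m$.

2. Let $P_0$ refer to the risk process with parameters

\[ \beta_0 = \beta\hat B[-\theta_0], \qquad B_0(dx) = \frac{e^{-\theta_0 x}}{\hat B[-\theta_0]} B(dx). \]

Then $E_0U^k = \hat B_0^{(k)}[0] = \hat B^{(k)}[-\theta_0]/\hat B[-\theta_0]$ and $\kappa_0(s) = \kappa(s - \theta_0) - \kappa(-\theta_0)$,
$\kappa_0'(0) = 0$.

3. For each $\theta$, let $P_\theta$ refer to the risk process with parameters

\[ \beta_\theta = \beta_0\hat B_0[\theta] = \beta\hat B[\theta - \theta_0], \qquad
   B_\theta(dx) = \frac{e^{\theta x}}{\hat B_0[\theta]} B_0(dx) = \frac{e^{(\theta - \theta_0)x}}{\hat B[\theta - \theta_0]} B(dx). \]

Then

\[ \kappa_\theta(s) = \kappa_0(s + \theta) - \kappa_0(\theta) = \kappa(s + \theta - \theta_0) - \kappa(\theta - \theta_0) \]

and the given risk process corresponds to $P_{\theta_0}$, where $\theta_0 = -\gamma_m$.


In this set-up, $P_\theta(\tau(u) < \infty) = 1$ for $\theta \ge 0$, $P_\theta(\tau(u) < \infty) < 1$ for $\theta < 0$, and
we are studying $\psi(u, T) = P_{\theta_0}(\tau(u) \le T)$ for $\theta_0 < 0$, $\theta_0 \uparrow 0$.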

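Recall that $IG(x; \zeta; u)$ denotes the distribution function of the passage time
of Brownian motion $\{W_\zeta(t)\}$ with unit variance and drift $\zeta$ from level 0 to level
$u > 0$. One has

\[ IG(x; \zeta; u) = IG(x/u^2; \zeta u; 1). \tag{6.1} \]

The corrected diffusion approximation to be derived is

\[ \psi(u, T) \approx IG\Bigl( \frac{T\delta_1}{u^2} + \frac{\delta_2}{u};\ -\frac{\gamma u}{2};\ 1 + \frac{\delta_2}{u} \Bigr), \tag{6.2} \]

where as usual $\gamma > 0$ is the adjustment coefficient for the given risk process,
i.e. the solution of $\kappa(\gamma) = 0$, and

\[ \delta_1 = \beta_0E_0U^2 = \beta\hat B''[\gamma_m], \qquad \delta_2 = \frac{E_0U^3}{3E_0U^2} = \frac{\hat B'''[\gamma_m]}{3\hat B''[\gamma_m]}. \]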
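To make the recipe (6.2) concrete, here is a minimal sketch for exponential claims with a given rate and premium rate 1 (an assumption made for the illustration only). In that case $\hat B[s] = \text{rate}/(\text{rate} - s)$, the equation $\kappa'(\gamma_m) = 0$ gives $\text{rate} - \gamma_m = \sqrt{\beta \cdot \text{rate}}$, and $\gamma = \text{rate} - \beta$. The function names are hypothetical.

```python
import math


def std_normal_cdf(x: float) -> float:
    """Standard normal c.d.f. Phi(x), expressed through the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def inverse_gaussian_cdf(x: float, zeta: float, u: float) -> float:
    """IG(x; zeta; u): first passage c.d.f. of drift-zeta Brownian motion to u > 0."""
    if x <= 0.0:
        return 0.0
    if math.isinf(x):
        return 1.0 if zeta >= 0.0 else math.exp(2.0 * zeta * u)
    r = math.sqrt(x)
    return (1.0 - std_normal_cdf(u / r - zeta * r)
            + math.exp(2.0 * zeta * u) * std_normal_cdf(-u / r - zeta * r))


def corrected_diffusion_exponential(u: float, T: float, beta: float, rate: float) -> float:
    """Corrected diffusion approximation (6.2), exponential(rate) claims (assumed)."""
    if not 0.0 < beta < rate:
        raise ValueError("need 0 < beta < rate, i.e. rho < 1")
    gamma = rate - beta                 # adjustment coefficient, kappa(gamma) = 0
    root = math.sqrt(beta * rate)       # rate - gamma_m, from kappa'(gamma_m) = 0
    b2 = 2.0 * rate / root**3           # B''[gamma_m]
    b3 = 6.0 * rate / root**4           # B'''[gamma_m]
    delta1 = beta * b2
    delta2 = b3 / (3.0 * b2)
    return inverse_gaussian_cdf(T * delta1 / u**2 + delta2 / u,
                                -gamma * u / 2.0,
                                1.0 + delta2 / u)
```

With `T = math.inf` the sketch returns $e^{-\gamma(u + \delta_2)}$, which for $\beta = 0.7$, rate 1 and $u = 10$ is close to the exact exponential-claims value $\rho e^{-\gamma u}$.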


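Write the initial reserve $u$ for the given risk process as $u = \zeta/\theta_0$ (note that
$\zeta < 0$) and, for brevity, write $\tau = \tau(u)$, $\xi = \xi(u) = S_\tau - u$. The first step in the
derivation is to note that

\[ \mathrm{Var}_{\theta_0} S_1 \sim \mathrm{Var}_0 S_1 = \beta_0E_0U^2 = \delta_1, \quad \theta_0 \uparrow 0. \]

Theorem 5.3 applies and yields

\[ \Bigl\{ \frac{|\zeta|}{u} S_{tu^2/(\zeta^2\delta_1)} \Bigr\}_{t \ge 0} \xrightarrow{D} \{W_{-1}(t)\}_{t \ge 0}, \]

which easily leads to

\[ \Bigl\{ \frac{1}{u} S_{tu^2/\delta_1} \Bigr\}_{t \ge 0} \xrightarrow{D} \{W_\zeta(t)\}_{t \ge 0}, \qquad
   \psi(u, tu^2) \to IG(t\delta_1; \zeta; 1). \]

Since

\[ \int_0^\infty e^{-\lambda t}\, IG(dt; \zeta; u) = e^{-uh(\lambda,\zeta)}, \quad \text{where } h(\lambda,\zeta) = \sqrt{2\lambda + \zeta^2} - \zeta, \tag{6.3} \]

this implies (take $u = 1$)

\[ E_{\theta_0} \exp\{-\lambda\delta_1\tau(u)/u^2\} \to e^{-h(\lambda,\zeta)}. \tag{6.4} \]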


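The idea of the proof is to improve upon this by an $O(u^{-1})$ term (in the following,
$\approx$ means up to $o(u^{-1})$ terms):


Proposition 6.1 As $u \to \infty$, $\theta_0 \uparrow 0$ in such a way that $\zeta = \theta_0u$ is fixed, it
holds for any fixed $\lambda > 0$ that

\[ E_{\theta_0} \exp\{-\lambda\delta_1\tau(u)/u^2\} \approx \exp\{-h(\lambda, -\gamma u/2)(1 + \delta_2/u)\}\, \bigl[ 1 + \lambda\delta_2/u \bigr]. \tag{6.5} \]

Once this is established, we get by formal Laplace transform inversion that

\[ \psi\Bigl( u, \frac{tu^2}{\delta_1} \Bigr) \approx IG\Bigl( t + \frac{\delta_2}{u};\ -\frac{\gamma u}{2};\ 1 + \frac{\delta_2}{u} \Bigr). \]

Indeed, the r.h.s. is the c.d.f. of a (defective) r.v. distributed as $Z - \delta_2/u$ where
$Z$ has distribution $IG(\cdot\,; -\gamma u/2; 1 + \delta_2/u)$. But the Laplace transform of such
a r.v. is

\[ Ee^{-\lambda Z} \cdot e^{\lambda\delta_2/u} \approx e^{-h(\lambda, -\gamma u/2)(1 + \delta_2/u)}\, \bigl[ 1 + \lambda\delta_2/u \bigr], \]

where the last expression coincides with the r.h.s. of (6.5) according to (6.3).
To arrive at (6.2), just replace $t$ by $T\delta_1/u^2$.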
Note, however, that whereas the proof of Proposition 6.1 below is exact, the
formal Laplace transform inversion is heuristic: an additional argument would
be required to infer that the remainder term in (6.2) is indeed $o(u^{-1})$. The
justification for the procedure is the wonderful numerical fit which has been
found in numerical examples and which for a small or moderate safety loading
$\eta$ is by far the best among the various available approximations [note, however,
that the saddlepoint approximation of Barndorff-Nielsen & Schmidli [138] is a
serious competitor and is in fact preferable if $\eta$ is large]. A numerical illustration
is given in Fig. V.5, which is based upon exponential claims with mean $\mu_B = 1$.
The solid line represents the exact value, calculated using numerical integration
and Proposition 1.3, and the dotted line the corrected diffusion approximation
(6.2). In (1) and (2), we have $\rho = \beta = 0.7$, in (3) and (4), $\rho = 0.4$. The initial
reserve $u$ has been selected such that the infinite horizon ruin probability $\psi(u)$
is 10% in (1) and (3), 1% in (2) and (4).


FIGURE V.5 [four panels (1)-(4), each plotting $\psi(u,T)$ against $T$: exact value (solid line) and corrected diffusion approximation (dotted line)]


It is seen that the numerical fit is extraordinary for $\rho = 0.7$. Note that the
ordinary diffusion approximation requires $\rho$ to be close to 1 and $\psi(u)$ to be not
too small, and all of the numerical studies the authors know of indicate that
its fit at $\rho = 0.7$ or at values of $\psi(u)$ like 1% is unsatisfying. Similarly, the fit
at $\rho = 0.4$ may not be outstanding but nevertheless, it gives the right order
of magnitude and the ordinary diffusion approximation hopelessly fails for this
value of $\rho$. For further numerical illustrations, see Asmussen [55], Barndorff-Nielsen
& Schmidli [138] and Asmussen & Højgaard [81].

The proof of Proposition 6.1 proceeds in several steps. 


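Lemma 6.2 With $\theta = (2\lambda + \zeta^2)^{1/2}$,

\[ e^{-h(\lambda,\zeta)} \approx E_{\theta_0} \exp\Bigl\{ \frac{h(\lambda,\zeta)\xi}{u} - \frac{\lambda\delta_1\tau}{u^2} - \frac{\delta_1\delta_2\tau}{2u^3}(\theta^3 - \zeta^3) \Bigr\}. \]

Proof. For $\theta > 0$,

\[ 1 = P_\theta(\tau < \infty) = E_{\theta_0} \exp\bigl\{ (\theta - \theta_0)(u + \xi) - \tau(\kappa_0(\theta) - \kappa_0(\theta_0)) \bigr\}. \]

Replacing $\theta$ by $\theta/u$ and $\theta_0$ by $\zeta/u$ yields

\[ e^{-(\theta - \zeta)} = E_{\theta_0} \exp\bigl\{ (\theta - \zeta)\xi/u - \tau(\kappa_0(\theta/u) - \kappa_0(\zeta/u)) \bigr\}. \]

Let $\theta = (2\lambda + \zeta^2)^{1/2} = h(\lambda,\zeta) + \zeta$ and note that

\[ \kappa_0(\theta) = \frac{1}{2}\theta^2\beta_0E_0U^2 + \frac{1}{6}\theta^3\beta_0E_0U^3 + \cdots = \delta_1\frac{\theta^2}{2} + \delta_1\delta_2\frac{\theta^3}{2} + \cdots. \tag{6.6} \]

Using $\theta^2 - \zeta^2 = 2\lambda$, the result follows.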
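Lemma 6.3 $\lim_{u \to \infty} E_0\xi(u) = E_0\xi(\infty) = \delta_2 = \dfrac{E_0U^3}{3E_0U^2}$.

Proof. By partial integration, the formulas

\[ P_0(\xi(0) > x) = P_0(S_{\tau(0)} > x) = \frac{1}{E_0U} \int_x^\infty P_0(U > y)\,dy, \]

\[ P_0(\xi(\infty) > x) = \frac{1}{E_0\xi(0)} \int_x^\infty P_0(\xi(0) > y)\,dy \]

imply

\[ E_0\xi(0)^k = \frac{E_0U^{k+1}}{(k+1)E_0U}, \qquad E_0\xi(\infty)^k = \frac{E_0\xi(0)^{k+1}}{(k+1)E_0\xi(0)}. \]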

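Lemma 6.4

\[ E_{\theta_0} \exp\Bigl\{ -\frac{\lambda\delta_1\tau}{u^2} \Bigr\} \approx \exp\{-h(\lambda,\zeta)(1 + \delta_2/u)\}
   \Bigl[ 1 + \frac{\delta_2}{2u}\Bigl( 2\lambda + \zeta^2 - \frac{\zeta^3}{\sqrt{2\lambda + \zeta^2}} \Bigr) \Bigr]. \]

Proof. It follows by a suitable variant of Stam’s lemma (Proposition 4.4) that
the r.h.s. in Lemma 6.2 behaves like

\[ E_{\theta_0} \exp\Bigl\{ -\frac{\lambda\delta_1\tau}{u^2} - \frac{\delta_1\delta_2\tau}{2u^3}(\theta^3 - \zeta^3) \Bigr\} \cdot E_{\theta_0} \exp\Bigl\{ \frac{h(\lambda,\zeta)\xi}{u} \Bigr\} \]
\[ \approx E_{\theta_0} \exp\Bigl\{ -\frac{\lambda\delta_1\tau}{u^2} - \frac{\delta_1\delta_2\tau}{2u^3}(\theta^3 - \zeta^3) \Bigr\} \Bigl[ 1 + \frac{h(\lambda,\zeta)\delta_2}{u} \Bigr] \]
\[ \approx E_{\theta_0} \exp\Bigl\{ -\frac{\lambda\delta_1\tau}{u^2} \Bigr\} \Bigl[ 1 + \frac{h(\lambda,\zeta)\delta_2}{u} \Bigr]
   - \frac{\delta_2}{2u}(\theta^3 - \zeta^3)\, E_{\theta_0}\Bigl[ \frac{\delta_1\tau}{u^2} \exp\Bigl\{ -\frac{\lambda\delta_1\tau}{u^2} \Bigr\} \Bigr]. \tag{6.7} \]

The last term is approximately

\[ -\frac{\delta_2}{2u}(\theta^3 - \zeta^3) \Bigl( -\frac{d}{d\lambda} \Bigr) e^{-h(\lambda,\zeta)}
 = -\frac{\delta_2}{2u}\Bigl[ 2\lambda + \zeta^2 - \frac{\zeta^3}{\sqrt{2\lambda + \zeta^2}} \Bigr] e^{-h(\lambda,\zeta)} \]
\[ \approx -\frac{\delta_2}{2u}\Bigl[ 2\lambda + \zeta^2 - \frac{\zeta^3}{\sqrt{2\lambda + \zeta^2}} \Bigr] \exp\{-h(\lambda,\zeta)(1 + \delta_2/u)\}. \]

The result follows by combining Lemma 6.2 and (6.7) and using

\[ e^{-h(\lambda,\zeta)} - \frac{h(\lambda,\zeta)\delta_2}{u} e^{-h(\lambda,\zeta)} \approx \exp\{-h(\lambda,\zeta)(1 + \delta_2/u)\}. \]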


The last step is to replace $h(\lambda,\zeta)$ by $h(\lambda, -\gamma u/2)$. There are two reasons for
this: in this way, we get the correct asymptotic exponential decay parameter
$\gamma$ in the approximation (6.2) for $\psi(u)$ (indeed, letting formally $T \to \infty$ yields
$\psi(u) \approx C'e^{-\gamma u}$ where $C' = e^{-\gamma\delta_2}$); and the correction terms which need to be
added cancel conveniently with some of the more complicated expressions in
Lemma 6.4.


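Lemma 6.5

\[ \exp\{-h(\lambda,\zeta)(1 + \delta_2/u)\} \approx \exp\{-h(\lambda, -\gamma u/2)(1 + \delta_2/u)\}
   \Bigl[ 1 - \frac{\delta_2}{2u}\Bigl( \zeta^2 - \frac{\zeta^3}{\sqrt{2\lambda + \zeta^2}} \Bigr) \Bigr]. \]

Proof. Use first (6.6) and $\kappa_0(\theta_0) = \kappa_0(\gamma + \theta_0)$ to get

\[ 0 = \frac{\delta_1}{2}(\gamma^2 + 2\gamma\theta_0) + \frac{\delta_1\delta_2}{2}(\gamma^3 + 3\gamma^2\theta_0 + 3\gamma\theta_0^2) + O(u^{-4}), \]

\[ \frac{\gamma}{2} + \theta_0 = -\frac{\delta_2}{2}(\gamma^2 + 3\gamma\theta_0 + 3\theta_0^2) + O(u^{-3}). \]

Thus $\gamma = -2\theta_0 + O(u^{-2})$, and inserting this and $\theta_0 = \zeta/u$ on the r.h.s. yields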


= exp{—h(d,0)(1 
Proof of Proposition 6.1: Just insert Lemma 6.5 in Lemma 6.4.


Notes and references Corrected diffusion approximations were introduced by 
Siegmund [809] in a discrete random walk setting, with the translation to risk processes 
being carried out by Asmussen [55]; this case is in part simpler than the general random 
walk case because the ladder height distribution $G_+$ can be found explicitly (as $\rho B_0$)
which avoids the numerical integration involving characteristic functions which was 
used in [809] to determine the constants. 

In Siegmund’s book [810], the approach to the finite horizon case is in part different 
and uses local central limit theorems. The adaptation to risk theory has not been 
carried out. 

The corrected diffusion approximation was extended to the renewal model in As- 
mussen & Højgaard [81], and to the Markov-modulated model of Chapter VII in 
Asmussen [58]; Fuh [379] considers the closely related case of discrete time Markov 
additive processes. 

Hogan [473] considered a variant of the corrected diffusion approximation which
does not require exponential moments. His ideas were adapted by Asmussen &
Binswanger [72] to derive approximations for the infinite horizon ruin probability $\psi(u)$
when claims are heavy-tailed; the analogous analysis of finite horizon ruin
probabilities $\psi(u, T)$ has not been carried out and seems non-trivial.

For corrected diffusion approximations with higher-order terms, see Blanchet &
Glynn [174]; their results also cover some heavy-tailed cases.

7 How does ruin occur?


We saw in Section 4 that given that ruin occurs, the ‘typical’ value of $\tau(u)$ (say in
sense of the conditional mean) was $um_L$, that is, the same as for the unconditional
Lundberg process. We shall now generalize this question by asking what a
sample path of the risk process looks like given it leads to ruin. The answer is
similar: the process behaved as if it changed its whole distribution to $P_L$, i.e.
changed its arrival rate from $\beta$ to $\beta_L$ and its claim size distribution from $B$ to $B_L$.
Recall that $\mathcal F_{\tau(u)}$ is the stopping time $\sigma$-algebra carrying all relevant information
about $\tau(u)$ and $\{S_t\}_{0 \le t \le \tau(u)}$. Define $P^{(u)} = P(\cdot \mid \tau(u) < \infty)$ as the distribution
of the risk process given ruin with initial reserve $u$. We are concerned with
describing the $P^{(u)}$-distribution of $\{S_t\}_{0 \le t \le \tau(u)}$ (note that the behavior after
$\tau(u)$ is trivial: by the strong Markov property, $\{S_{\tau(u)+t} - S_{\tau(u)}\}_{t \ge 0}$ is just an
independent copy of $\{S_t\}_{t \ge 0}$).


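Theorem 7.1 Let $\{F(u)\}_{u \ge 0}$ be any family of events with $F(u) \in \mathcal F_{\tau(u)}$ and
satisfying $P_LF(u) \to 1$, $u \to \infty$. Then also $P^{(u)}F(u) \to 1$.


Proof.

\[ P^{(u)}F(u)^c = \frac{P(F(u)^c;\ \tau(u) < \infty)}{P(\tau(u) < \infty)} = \frac{e^{-\gamma u}E_L\bigl[ e^{-\gamma\xi(u)};\ F(u)^c \bigr]}{\psi(u)}
 \le \frac{e^{-\gamma u}P_L(F(u)^c)}{\psi(u)} \sim \frac{e^{-\gamma u}P_L(F(u)^c)}{Ce^{-\gamma u}} \to 0. \]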


Corollary 7.2 If $B$ is exponential, then $P^{(u)}$ and $P_L$ coincide on
$\mathcal F_{\tau(u)-} = \sigma\bigl( \tau(u); \{S_t\}_{0 \le t < \tau(u)} \bigr)$.


Proof. Write $e^{-\gamma S_{\tau(u)}} = e^{-\gamma u}e^{-\gamma\xi(u)}$. In the exponential case, $\mathcal F_{\tau(u)-}$ and $\xi(u)$
are independent, so in the proof, the numerator becomes

\[ e^{-\gamma u}E_Le^{-\gamma\xi(u)}\, P_L(F(u)^c) = e^{-\gamma u}C\, P_L(F(u)^c) \]

when $F(u) \in \mathcal F_{\tau(u)-}$ and similarly the denominator is exactly equal to $Ce^{-\gamma u}$.


Note that basically the difference between $\mathcal F_{\tau(u)}$ and $\mathcal F_{\tau(u)-}$ is that $\xi(u)$ is not
$\mathcal F_{\tau(u)-}$-measurable. In fact, $\xi(u)$ is exponential with rate $\delta$ w.r.t. $P$ and rate
$\delta_L = \beta$ w.r.t. $P_L$.

As example, we give a typical application of Theorem 7.1, stating roughly
that under $P^{(u)}$, the Poisson rate changes from $\beta$ to $\beta_L$ and the claim size distribution
from $B$ to $B_L$. Recall that $\beta_L = \beta\hat B[\gamma]$ and $B_L(dx) = e^{\gamma x}B(dx)/\hat B[\gamma]$,
and let $M(u)$ be the index of the claim leading to ruin (thus $\tau(u) = T_1 + T_2 + \cdots + T_{M(u)}$).


Corollary 7.3

\[ \frac{1}{M(u)} \sum_{k=1}^{M(u)} I(T_k \le x) \xrightarrow{P^{(u)}} 1 - e^{-\beta_Lx}, \qquad
   \frac{1}{M(u)} \sum_{k=1}^{M(u)} I(U_k \le x) \xrightarrow{P^{(u)}} B_L(x). \]


The proof of the second is similar. 
We finally consider the limiting joint distribution of
$\zeta(u) = u - S_{\tau(u)-} = R_{\tau(u)-}$ and $\xi(u) = S_{\tau(u)} - u = -R_{\tau(u)}$
(the surplus prior to ruin, resp. the deficit at ruin).


Proposition 7.4 Under the conditions of the Cramér-Lundberg approximation,
$(\zeta(u), \xi(u))$ has a proper limit $(\zeta(\infty), \xi(\infty))$ as $u \to \infty$ in $P^{(u)}$-distribution. The
limiting distribution is given by

\[ P(\zeta(\infty) \in dx,\ \xi(\infty) > y) = \frac{\beta}{1 - \rho}\, \bar B(x + y)(e^{\gamma x} - 1). \]


Proof. Define $Z(u) = P(\zeta(u) \in dx,\ \xi(u) > y)$. Consider first the case where
$\zeta(u) \in dx$, $\xi(u) > y$ occurs in the first ladder step, illustrated in Fig. V.6.
For occurrence in the first ladder step, we need $x > u$ and $\zeta(0) = x - u$. Also,
by Theorem IV.2.2 the (defective) density of $\zeta(0)$ is $\beta\bar B(z)$ and if $\zeta(0) = z$, the
available information on $\xi(0)$ given $\zeta(0) = z$ is that it has the distribution of a
claim $U$ conditioned to exceed $z$. Thus, taking $z = x - u$, we get the contribution
to $Z(u)$ from the first ladder step as

\[ \beta\bar B(x - u) \cdot \frac{\bar B(x + y)}{\bar B(x - u)}\, I(u < x) = \beta\bar B(x + y)\, I(u < x). \]

If $\xi(0) = z < u$, everything repeats itself from $z$, and thus, since also $\xi(0)$ has
density $\beta\bar B(z)$,

\[ Z(u) = \beta\bar B(x + y)\, I(u < x) + \int_0^u Z(u - z)\beta\bar B(z)\,dz. \]

This is a defective renewal equation, and since $\kappa(\gamma) = 0$ implies $\int_0^\infty e^{\gamma v}\beta\bar B(v)\,dv = 1$,
the usual exponential technique gives that $e^{\gamma u}Z(u)$ has the limit

\[ \beta\bar B(x + y) \int_0^x e^{\gamma u}\,du \Big/ \int_0^\infty \beta\bar B(v)ve^{\gamma v}\,dv
 = \beta\bar B(x + y)\, \frac{e^{\gamma x} - 1}{\gamma} \cdot \frac{\gamma C}{1 - \rho}, \]

where $C$ is the Cramér-Lundberg constant (for the last equality, see the calculations
around IV.(5.8) and IV.(5.11)). Thus $Z(u)/Ce^{-\gamma u}$ has the asserted limit
which is what was to show.
That the limit is proper follows from

\[ \int_0^\infty \frac{\beta}{1 - \rho}\bar B(x)(e^{\gamma x} - 1)\,dx
 = \frac{1}{1 - \rho}\Bigl( \int_0^\infty \beta\bar B(x)e^{\gamma x}\,dx - \rho \Bigr)
 = \frac{1}{1 - \rho}\Bigl( \frac{\beta}{\gamma}(\hat B[\gamma] - 1) - \rho \Bigr) = 1. \]
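As a numerical sanity check of the properness computation, the following sketch integrates the limiting expression $\frac{\beta}{1-\rho}\bar B(x+y)(e^{\gamma x} - 1)$ over $x$ in the special case of exponential claims with rate $\delta$ (an assumption made for the illustration, with $\gamma = \delta - \beta$). The function names, the truncation point and the grid size are hypothetical choices.

```python
import math


def limit_density(x: float, y: float, beta: float, delta: float) -> float:
    """(beta/(1-rho)) * Bbar(x+y) * (exp(gamma*x) - 1) for exponential(delta) claims."""
    rho = beta / delta
    gamma = delta - beta
    return beta / (1.0 - rho) * math.exp(-delta * (x + y)) * math.expm1(gamma * x)


def deficit_tail(y: float, beta: float, delta: float,
                 upper: float = 200.0, n: int = 200_000) -> float:
    """Trapezoidal approximation of the integral over x in [0, upper]."""
    h = upper / n
    total = 0.5 * (limit_density(0.0, y, beta, delta)
                   + limit_density(upper, y, beta, delta))
    for i in range(1, n):
        total += limit_density(i * h, y, beta, delta)
    return total * h
```

With $y = 0$ the integral is close to 1, in line with the limit being proper. For $y > 0$ it is close to $e^{-\delta y}$, consistent with the deficit at ruin being exponential with rate $\delta$ in this case.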
Notes and references Proposition 7.4 can be found in Schmidli [773]. It will
be shown in X.4 that in the heavy-tailed case, $\zeta(u)$ and $\xi(u)$ need to be scaled down
before a conditional limit can be obtained.

The remaining results of the present section are part of a more general study carried 
out by the first author [54]. A somewhat similar study was carried out in the queueing 
setting by Anantharam [48], who also treated the heavy-tailed case; however, the 
queueing results are of a somewhat different type because of the presence of reflection 
at 0. 

From a mathematical point of view, the subject treated in this section leads into 
the area of large deviations theory. This is currently a very active area of research, see 
further XIII.1. 

Convergence properties of empirical finite-time ruin probabilities are investigated
in Loisel, Mazza & Rullière [608].

Schmidli [781] derives some exact expressions for the distribution of the risk process
given that ruin occurs. These may be seen as special h-transform calculations for
piecewise deterministic Markov processes (recall that the distribution of a Markov
chain or Markov process conditioned to hit a set is always an h-transform, see e.g.
Asmussen & Glynn [79, VI.7]).

Chapter VI


Renewal arrivals 


1 Introduction 


The basic assumption of this chapter states that the arrival epochs $\sigma_1, \sigma_2, \ldots$
of the risk process form a renewal process: letting $T_n = \sigma_n - \sigma_{n-1}$ ($T_1 = \sigma_1$),
the $T_n$ are independent, with the same distribution $A$ (say) for $T_2, T_3, \ldots$
In the so-called zero-delayed case, the distribution $A_1$ of $T_1$ is $A$ as well. A
different important possibility is $A_1$ to be the stationary delay distribution $A_0$
with density $\bar A(x)/\mu_A$. Then the arrival process is stationary which could be
a reasonable assumption in many cases (for these and further basic facts from
renewal theory, see A.1).

We use much of the same notation as in Chapter I. Thus the premium rate
is 1, the claim sizes $U_1, U_2, \ldots$ are i.i.d. with common distribution $B$, $\{S_t\}$ is the
claim surplus process given by I.(1.5), with

\[ N_t = \#\{n : \sigma_n \le t\} \]

the number of arrivals before $t$, and $M$ is the maximum of $\{S_t\}$, $\tau(u)$ the time
to ruin. The ruin probability corresponding to the zero-delayed case is denoted
by $\psi(u)$, the one corresponding to the stationary case by $\psi^{(0)}(u)$, and the one
corresponding to $T_1 = s$ by $\psi_s(u)$.


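Proposition 1.1 Define $\rho = \mu_B/\mu_A$. Then regardless of the distribution $A_1$ of $T_1$,

\[ \lim_{t \to \infty} \frac{ES_t}{t} = \rho - 1, \tag{1.1} \]

\[ \lim_{t \to \infty} \frac{\mathrm{Var}(S_t)}{t} = \frac{\mu_B^2\sigma_A^2}{\mu_A^3} + \frac{\sigma_B^2}{\mu_A}. \tag{1.2} \]

Furthermore for any $a > 0$,

\[ \lim_{t \to \infty} E[S_{t+a} - S_t] = a(\rho - 1). \tag{1.3} \]

Proof. Obviously,

\[ ES_t = E\Bigl[ \sum_{i=1}^{N_t} U_i \Bigr] - t = EN_t \cdot \mu_B - t. \]

However, by the elementary renewal theorem (cf. A.1) $EN_t/t \to 1/\mu_A$. From
this (1.1) follows, and (1.3) follows similarly by Blackwell’s renewal theorem,
stating that $E[N_{t+a} - N_t] \to a/\mu_A$.

For (1.2), we get in the same way by using known facts about $EN_t$ and
$\mathrm{Var}\,N_t$ that

\[ \mathrm{Var}(S_t) = \mathrm{Var}\,E\Bigl[ \sum_{i=1}^{N_t} U_i \,\Big|\, N_t \Bigr] + E\,\mathrm{Var}\Bigl[ \sum_{i=1}^{N_t} U_i \,\Big|\, N_t \Bigr] \]
\[ = \mathrm{Var}(\mu_BN_t) + E(\sigma_B^2N_t) \]
\[ = t\mu_B^2\frac{\sigma_A^2}{\mu_A^3} + t\frac{\sigma_B^2}{\mu_A} + o(t). \]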


Of course, Proposition 1.1 gives the desired interpretation of the constant $\rho$
as the expected claims per unit time. Thus, the definition $\eta = 1/\rho - 1$ of the
safety loading appears reasonable here as well.
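The limits (1.1) and (1.2) depend only on the first two moments of $A$ and $B$, which the following minimal sketch makes explicit. The function name and argument names are hypothetical; the premium rate is 1 as in the text.

```python
def renewal_drift_and_variance(mu_A: float, var_A: float,
                               mu_B: float, var_B: float) -> tuple[float, float]:
    """Return (rho - 1, lim Var(S_t)/t) as in (1.1) and (1.2) of Proposition 1.1."""
    rho = mu_B / mu_A
    drift = rho - 1.0
    variance_rate = mu_B**2 * var_A / mu_A**3 + var_B / mu_A
    return drift, variance_rate
```

In the Poisson case ($A$ exponential with rate $\beta$) the variance constant reduces to $\beta\mu_B^{(2)}$, and for deterministic arrivals at spacing $a$ it reduces to $\sigma_B^2/a$.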

The renewal model is often referred to as the Sparre Andersen process, after
E. Sparre Andersen whose 1959 paper [816] was the first to treat renewal
assumptions in risk theory in more depth. The simplest case is of course the
Poisson case where $A$ and $A_1$ are both exponential with rate $\beta$. This has a
direct physical interpretation (a large portfolio with claims arising with small
rates and independently). Here are two special cases of the renewal model with
a similar direct interpretation (see also the discussion in I.3):


Example 1.2 (DETERMINISTIC ARRIVALS) If $A$ is degenerate, say at $a$, one
could imagine that the claims are recorded only at discrete epochs (say each
week or month) and thus each $U_n$ is really the accumulated claims over a period
of length $a$.


Example 1.3 (SWITCHED POISSON ARRIVALS) Assume that the process has 
a random environment with two states ON, OFF, such that no arrivals occur in 
the OFF state, but the arrival rate in the ON state is $\beta > 0$. If the environment
is Markovian with transition rate $\lambda$ from ON to OFF and $\mu$ from OFF to ON, the
interarrival times become i.i.d. (an arrival occurs necessarily in the ON state, and
then the whole process repeats itself). More precisely, $A$ is phase-type (Example
I.2.4) with phase space {ON, OFF}, initial vector (1 0) and phase generator

$$\begin{pmatrix} -\beta-\lambda & \lambda \\ \mu & -\mu \end{pmatrix}.$$
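As a small illustration of the phase-type description in Example 1.3, the mean interarrival time can be computed from the standard formula (initial vector) times the inverse of minus the phase generator times the vector of ones. The sketch below assumes the generator with rows (-beta - lam, lam) and (mu, -mu); the function name and the use of Cramer's rule are illustrative choices.

```python
def switched_poisson_means(beta, lam, mu):
    """Expected time until the next arrival, starting in ON and in OFF.

    Solves (-T) m = e for the 2x2 phase generator
    T = [[-beta - lam, lam], [mu, -mu]] by Cramer's rule.
    With initial vector (1, 0) the mean interarrival time is m_on.
    """
    a, b = beta + lam, -lam        # first row of -T
    c, d = -mu, mu                 # second row of -T
    det = a * d - b * c            # equals beta * mu
    m_on = (d - b) / det           # (lam + mu) / (beta * mu)
    m_off = (a - c) / det          # (beta + lam + mu) / (beta * mu)
    return m_on, m_off


m_on, m_off = switched_poisson_means(beta=2.0, lam=1.0, mu=3.0)
```

The value `m_on` equals the reciprocal of beta times the long-run fraction of time spent in the ON state, mu/(lam + mu), which is what one would expect heuristically.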


However, in general the mechanism generating a renewal arrival process ap- 
pears much harder to interpret in the risk theory context and therefore the 
relevance of the model has been questioned repeatedly, see the discussion in I.3. 
Nevertheless we will present at least some basic features of the renewal model, if 
only for the mathematical elegance of the subject, the fundamental connections 
to the theory of queues and random walks, and for historical reasons. 

The following representation of the ruin probability (already discussed in 
Section IV.1) will be a basic vehicle for studying the ruin probabilities: 


Proposition 1.4 The ruin probabilities for the zero-delayed case can be represented
as $\psi(u) = \mathbb{P}(M^{(d)} > u)$ where $M^{(d)} = \max\{S^{(d)}_n : n = 0, 1, \ldots\}$ with
$\{S^{(d)}_n\}$ a discrete time random walk with increments distributed as the independent
difference $U - T$ between a claim $U$ and an interarrival time $T$.


Proof. The essence of the argument is that ruin can only occur at claim times.
The values of the claim surplus process just after claims have the same distribution
as $\{S^{(d)}_n\}$. Since the claim surplus process $\{S_t\}$ decreases in between
arrival times, we have

$$\max_{0\le t<\infty}S_t = \max_{n=0,1,\ldots}S^{(d)}_n.$$


From this the result immediately follows. 


For later use, we note that the ruin probabilities for the delayed case $T_1 = s$
can be expressed in terms of the ones for the zero-delayed case as

$$\psi_s(u) = \bar B(u+s) + \int_0^{u+s}\psi(u+s-y)\,B(dy). \qquad (1.4)$$

Indeed, the first term represents the probability $\mathbb{P}(U_1 - s > u)$ of ruin at the
time $s$ of the first claim whereas the second is $\mathbb{P}(\tau(u) < \infty, U_1 - s \le u)$, as
follows easily by noting that the evolution of the risk process after time $s$ is
that of a renewal risk model with initial reserve $u + s - U_1$. For the stationary case,
integrate (1.4) w.r.t. $A_0$.
2 Exponential claims. The compound Poisson
model with negative claims

We first consider a variant of the compound Poisson model obtained essentially
by sign-reversion. That is, the claims and the premium rate are negative so that
the risk reserve process, resp. the claim surplus process are given by


$$R^*_t = u + \sum_{i=1}^{N^*_t}U^*_i - t, \qquad S^*_t = t - \sum_{i=1}^{N^*_t}U^*_i,$$

where $\{N^*_t\}$ is a Poisson process with rate $\beta^*$ (say) and the $U^*_i$ are independent
of $\{N^*_t\}$ and i.i.d. with common distribution $B^*$ (say) concentrated on $(0,\infty)$.
This model is sometimes referred to as the dual risk model in the literature¹. A
typical sample path of $\{R^*_t\}$ is illustrated in Fig. VI.1.


FIGURE VI.1 (sample path of $\{R^*_t\}$ with the ruin time $\tau^*(u)$ marked)


One interpretation of the model is to have continuous expenses and events 
according to a Poisson process (e.g. innovations) which increase the value of the 
portfolio or company. Another interpretation is of course the workload in an 
M/G/1 queue in its first busy period. 

Define the time of ruin $\tau^*(u) = \inf\{t > 0 : R^*_t < 0\}$. Using Lundberg conjugation,
we shall be able to compute the ruin probability $\psi^*(u) = \mathbb{P}(\tau^*(u) < \infty)$
for this model very quickly. A simple sample path comparison will then provide
us with the ruin probabilities for the renewal model with exponential claim size
distribution.


1 Although this terminology is not related to the other duality concepts used in this book. 
Theorem 2.1 If $\beta^*\mu_{B^*} \le 1$, then $\psi^*(u) = 1$ for all $u \ge 0$. If $\beta^*\mu_{B^*} > 1$, then
$\psi^*(u) = e^{-\gamma u}$ where $\gamma > 0$ is the unique solution of

$$0 = \kappa^*(-\gamma) = \beta^*(\hat B^*[-\gamma] - 1) + \gamma. \qquad (2.1)$$

[Note that $\kappa^*(\alpha) = \log\mathbb{E}e^{-\alpha S^*_1}$.]
Proof. Define

$$S^*_t = u - R^*_t, \qquad \hat S^*_t = R^*_t - u = -S^*_t.$$

Then $\{\hat S^*_t\}$ is the claim surplus process of a standard compound Poisson risk
process with parameters $\beta^*, B^*$. If $\beta^*\mu_{B^*} \le 1$, then by Proposition IV.1.2

$$\sup_{t\ge0}S^*_t = -\inf_{t\ge0}\hat S^*_t = \infty$$

and hence $\psi^*(u) = 1$ follows.


FIGURE VI.2 (panel (a): $\kappa^*(\alpha)$; panel (b): $\tilde\kappa(\alpha)$)


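Assume now $\beta^*\mu_{B^*} > 1$. Then the function $\kappa^*$ is defined on the whole of $(-\infty, 0)$
and has typically the shape on Fig. VI.2(a). Hence $\gamma$ exists and is unique. Let

$$\tilde\beta = \beta^*\hat B^*[-\gamma], \qquad \tilde B(dx) = \frac{e^{-\gamma x}}{\hat B^*[-\gamma]}\,B^*(dx),$$

and let $\{\tilde S_t\}$ be a compound Poisson risk process with parameters $\tilde\beta, \tilde B$. Then the
c.g.f. of $\{\tilde S_t\}$ is $\tilde\kappa(\alpha) = \kappa^*(\alpha - \gamma)$, cf. Fig. VI.2(b), and the Lundberg conjugate
of $\{\tilde S_t\}$ is $\{\hat S^*_t\}$ and vice versa. Define

$$\tau_-(u) = \inf\{t > 0 : \hat S^*_t = -u\}, \qquad \tilde\tau_-(u) = \inf\{t > 0 : \tilde S_t = -u\}.$$

Since $\tilde\kappa'(0) < 0$, the safety loading of $\{\tilde S_t\}$ is $> 0$. Hence $\tilde\tau_-(u) < \infty$ a.s., and
thus

$$1 = \tilde{\mathbb{P}}(\tilde\tau_-(u) < \infty) = \mathbb{E}\bigl[e^{-\gamma\hat S^*_{\tau_-(u)}};\ \tau_-(u) < \infty\bigr] = e^{\gamma u}\,\mathbb{P}(\tau_-(u) < \infty) = e^{\gamma u}\psi^*(u).$$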


Now return to the renewal model. 
Theorem 2.2 If $B$ is exponential, with rate $\delta$ (say), and $\delta\mu_A > 1$, then
$\psi(u) = \pi_+e^{-\gamma u}$ where $\gamma > 0$ is the unique solution of

$$1 = \mathbb{E}e^{\gamma(U-T)} = \frac{\delta}{\delta-\gamma}\,\hat A[-\gamma] \qquad (2.2)$$

and $\pi_+ = 1 - \gamma/\delta$.
Proof. We can couple the renewal model $\{S_t\}$ and the compound Poisson model
$\{S^*_t\}$ with negative claims in such a way that the interarrival times of $\{S^*_t\}$
are $T^*_0, T^*_1 = U_1, T^*_2 = U_2, \ldots$ Then $B^* = A$, $\beta^* = \delta$, and (2.1) means that
$\delta(\hat A[-\gamma] - 1) + \gamma = 0$ which is easily seen to be the same as (2.2).

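Now the value of $\{S^*_t\}$ just before the $n$th claim is

$$T^*_0 + T^*_1 + \cdots + T^*_n - U^*_1 - \cdots - U^*_n,$$

and from Fig. VI.1 it is seen that ruin is equivalent to one of these values being
$> u$. Hence

$$M^* = \sup_{t\ge0}S^*_t = \max_{n=0,1,\ldots}\{T^*_0 + T^*_1 + \cdots + T^*_n - U^*_1 - \cdots - U^*_n\}$$
$$= T^*_0 + \max_{n=0,1,\ldots}\{U_1 + \cdots + U_n - T_1 - \cdots - T_n\} \stackrel{\mathcal D}{=} T^*_0 + M^{(d)}$$

in the notation of Proposition 1.4.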
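Taking m.g.f.'s and noting that $\psi^*(u) = \mathbb{P}(M^* > u)$ so that Theorem 2.1
means that $M^*$ is exponentially distributed with rate $\gamma$, we get

$$\mathbb{E}e^{\alpha M^{(d)}} = \frac{\mathbb{E}e^{\alpha M^*}}{\mathbb{E}e^{\alpha T^*_0}} = \frac{\gamma/(\gamma-\alpha)}{\delta/(\delta-\alpha)} = \frac{\gamma}{\delta} + \Bigl(1 - \frac{\gamma}{\delta}\Bigr)\frac{\gamma}{\gamma-\alpha} = 1 - \pi_+ + \pi_+\frac{\gamma}{\gamma-\alpha},$$

i.e. the distribution of $M^{(d)}$ is a mixture of an atom at zero and an exponential
distribution with rate parameter $\gamma$ with weights $1 - \pi_+$ and $\pi_+$, respectively.
Hence $\mathbb{P}(M^{(d)} > u) = \pi_+e^{-\gamma u}$.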
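Equation (2.2) rarely has a closed-form root outside the Poisson case, but it is straightforward to solve numerically. The following is a minimal sketch under stated assumptions: the function names are hypothetical, `lst_a` stands for the Laplace transform of the interarrival time supplied by the user, and plain bisection is an illustrative choice rather than a method prescribed by the text. It relies on the left side of (2.2) minus 1 being negative to the left of the root and positive to the right of it on the interval from 0 to delta, which follows from convexity of the c.g.f. of U - T.

```python
import math


def adjustment_coefficient(delta, lst_a, tol=1e-12):
    """Solve 1 = delta / (delta - gamma) * lst_a(gamma) for gamma in (0, delta).

    delta: rate of the exponential claim size distribution B
    lst_a: function s -> E exp(-s T), Laplace transform of A
    Assumes delta * mu_A > 1, so that the root exists and is unique.
    """
    def f(g):
        return delta / (delta - g) * lst_a(g) - 1.0

    lo, hi = 0.0, delta
    while hi - lo > tol * delta:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:      # still to the left of the root
            lo = mid
        else:                 # at or to the right of the root
            hi = mid
    return 0.5 * (lo + hi)


def ruin_probability(u, delta, lst_a):
    """psi(u) = pi_plus * exp(-gamma * u) with pi_plus = 1 - gamma / delta."""
    gamma = adjustment_coefficient(delta, lst_a)
    pi_plus = 1.0 - gamma / delta
    return pi_plus * math.exp(-gamma * u)


# Poisson arrivals with rate 2 (illustrative): E exp(-s T) = 2 / (2 + s).
def poisson_lst(s):
    return 2.0 / (2.0 + s)


gamma_poisson = adjustment_coefficient(3.0, poisson_lst)
psi_zero = ruin_probability(0.0, 3.0, poisson_lst)
```

For Poisson arrivals with rate beta the root is delta - beta and the ruin probability at u = 0 is beta/delta, so this case serves as a check on the numerical routine.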


Remark 2.3 A variant of the last part of the proof, which has the advantage
of avoiding transforms and leading up to the basic ideas of the study of the
phase-type case in IX.4 goes as follows: define $\pi_+ = \mathbb{P}(M^{(d)} > 0)$ and consider
$\{S^*_t\}$ only when the process is at a maximum value. According to Theorem 2.1,
the failure rate of this process is $\gamma$. However, alternatively termination occurs
at a jump time (having rate $\delta$), with the probability that a particular jump time
is not followed by any later maximum values being $1 - \pi_+$, and hence the failure
rate is $\delta(1 - \pi_+)$. Putting this equal to $\gamma$, we see that $\gamma = \delta(1 - \pi_+)$ and hence
$\pi_+ = 1 - \gamma/\delta$. However, consider instead the failure rate of $M^{(d)}$ and decompose
$M^{(d)}$ into ladder steps as in III.6, IV.2. The probability that the first ladder
step is finite is $\pi_+$. Furthermore, a ladder step is the overshoot of a claim size,
hence exponential with rate $\delta$. Thus a ladder step terminates at rate $\delta$ and is
followed by one more with probability $\pi_+$. Hence the failure rate of $M^{(d)}$ is
$\delta(1 - \pi_+) = \gamma$ and consequently $\mathbb{P}(M^{(d)} > u) = \mathbb{P}(M^{(d)} > 0)e^{-\gamma u} = \pi_+e^{-\gamma u}$.


3 Change of measure via exponential families 


We shall discuss two points of view, the imbedded discrete time random walk 
and Markov additive processes. 


3a The imbedded random walk 


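The key steps have already been carried out in Corollary II.3.5, which states
that for a given $\alpha$, the relevant exponential change of measure corresponds to
changing the distribution $F^{(d)}$ of $Y = U - T$ to

$$F^{(d)}_\alpha(x) = e^{-\kappa^{(d)}(\alpha)}\int_{-\infty}^{x}e^{\alpha y}\,F^{(d)}(dy)$$

where

$$\kappa^{(d)}(\alpha) = \log\hat F^{(d)}[\alpha] = \log\hat B[\alpha] + \log\hat A[-\alpha]. \qquad (3.1)$$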


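It only remains to note that this change of measure can be achieved by changing
the interarrival distribution $A$ and the claim distribution $B$ to $A^{(d)}_\alpha$, resp. $B^{(d)}_\alpha$,
where

$$A^{(d)}_\alpha(dt) = \frac{e^{-\alpha t}}{\hat A[-\alpha]}\,A(dt), \qquad B^{(d)}_\alpha(dx) = \frac{e^{\alpha x}}{\hat B[\alpha]}\,B(dx).$$

This follows since, letting $\mathbb{P}^{(d)}_\alpha$ refer to the renewal risk model with these changed
parameters, we have

$$\mathbb{E}^{(d)}_\alpha e^{\beta Y} = \hat B^{(d)}_\alpha[\beta]\,\hat A^{(d)}_\alpha[-\beta] = \frac{\hat B[\alpha+\beta]}{\hat B[\alpha]}\cdot\frac{\hat A[-\alpha-\beta]}{\hat A[-\alpha]} = \frac{\hat F^{(d)}[\alpha+\beta]}{\hat F^{(d)}[\alpha]} = \hat F^{(d)}_\alpha[\beta].$$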
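Let

$$M(u) = \inf\{n = 1, 2, \ldots : S^{(d)}_n > u\}$$

be the number of claims leading to ruin and

$$\xi(u) = S_{\tau(u)} - u = S^{(d)}_{M(u)} - u$$

the overshoot, then we get:


Proposition 3.1 For any $\alpha$ such that $\kappa^{(d)\prime}(\alpha) \ge 0$,

$$\psi(u) = e^{-\alpha u}\,\mathbb{E}^{(d)}_\alpha e^{-\alpha\xi(u) + M(u)\kappa^{(d)}(\alpha)}.$$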


Consider now the Lundberg case, i.e. let $\gamma > 0$ be the solution of $\kappa^{(d)}(\gamma) = 0$.
We have the following versions of Lundberg's inequality and the Cramér-Lundberg
approximation:


Theorem 3.2 In the zero-delayed case,

(a) $\psi(u) \le e^{-\gamma u}$;

(b) $\psi(u) \sim Ce^{-\gamma u}$ where $C = \lim_{u\to\infty}\mathbb{E}^{(d)}_\gamma e^{-\gamma\xi(u)}$, provided the distribution $F$
of $U - T$ is non-lattice.


Proof. Proposition 3.1 implies

$$\psi(u) = e^{-\gamma u}\,\mathbb{E}^{(d)}_\gamma e^{-\gamma\xi(u)},$$

and claim (a) follows immediately from this and $\xi(u) > 0$ a.s. For claim (b), just
note that $F^{(d)}_\gamma$ is non-lattice when $F$ is so. This is known to be sufficient for $\xi(0)$
to be non-lattice w.r.t. $\mathbb{P}^{(d)}_\gamma$ ([APQ, p. 222]) and thereby for $\xi(u)$ to converge in
distribution, since $\mathbb{P}^{(d)}_\gamma(\tau(0) < \infty) = 1$ because of $\kappa^{(d)\prime}(\gamma) > 0$.


It should be noted that the computation of the Cramér-Lundberg constant
$C$ is much more complicated for the renewal case than for the compound Poisson
case where $C = (1 - \rho)/(\beta\hat B'[\gamma] - 1)$ is explicit given $\gamma$. In fact, in the easiest
non-exponential case where $B$ is phase-type, the evaluation of $C$ is at the same
level of difficulty as the evaluation of $\psi(u)$ in matrix-exponential form, cf. IX.4.


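Corollary 3.3 For the delayed case $T_1 = s$, $\psi_s(u) \sim C_se^{-\gamma u}$ where
$C_s = Ce^{-\gamma s}\hat B[\gamma]$. For the stationary case, $\psi^{(s)}(u) \sim C^{(s)}e^{-\gamma u}$ where

$$C^{(s)} = \frac{C(\hat B[\gamma] - 1)}{\gamma\mu_A}.$$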
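Proof. Using (1.4), $\bar B(x) = o(e^{-\gamma x})$ and dominated convergence, we get

$$e^{\gamma u}\psi_s(u) = e^{\gamma u}\bar B(u+s) + \int_0^{u+s}e^{\gamma(y-s)}e^{\gamma(u+s-y)}\psi(u+s-y)\,B(dy)$$
$$\to 0 + \int_0^\infty e^{\gamma(y-s)}C\,B(dy) = C_s.$$

For the stationary case, another use of dominated convergence combined with
$\hat A_0[s] = (\hat A[s] - 1)/(s\mu_A)$ yields

$$e^{\gamma u}\psi^{(s)}(u) = \int_0^\infty e^{\gamma u}\psi_s(u)\,A_0(ds) \to \int_0^\infty Ce^{-\gamma s}\hat B[\gamma]\,A_0(ds)$$
$$= C\hat B[\gamma]\hat A_0[-\gamma] = \frac{C\hat B[\gamma](\hat A[-\gamma] - 1)}{-\gamma\mu_A} = C^{(s)},$$

where the last step uses $\hat B[\gamma]\hat A[-\gamma] = 1$.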


Of course, a delayed version of Lundberg’s inequality can be obtained in a 
similar manner. The expressions are slightly more complicated and we omit the 
details. 


3b Markov additive representations 


We take the Markov additive point of view of III.5. The underlying Markov
process $\{J_t\}$ for the Markov additive process $\{X_t\} = \{(J_t, S_t)\}$ can be defined by
taking $J_t$ as the residual time until the next arrival. According to Remark III.4.9,
we look for a function $h(s)$ and a $\kappa$ (both depending on $\alpha$) such that
$\mathcal{G}h_\alpha(s, 0) = \kappa(\alpha)h(s)$, where $\mathcal{G}$ is the generator of $\{X_t\} = \{(J_t, S_t)\}$ and
$h_\alpha(s, y) = e^{\alpha y}h(s)$. Let $\mathbb{P}_s, \mathbb{E}_s$ refer to the case $J_0 = s$. For $s > 0$,


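$$\mathbb{E}_sh_\alpha(J_{dt}, S_{dt}) = h(s - dt)e^{-\alpha\,dt} = h(s) - dt\,(\alpha h(s) + h'(s))$$

so that $\mathcal{G}h_\alpha(s, 0) = -\alpha h(s) - h'(s)$. Equating this to $\kappa h(s)$ and dividing by
$h(s)$ yields $h'(s)/h(s) = -\alpha - \kappa$,

$$h(s) = e^{-(\alpha+\kappa(\alpha))s} \qquad (3.2)$$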


(normalizing by $h(0) = 1$). To determine $\kappa$, we invoke the behavior at the
boundary 0. Here

$$1 = h_\alpha(0, 0) = \mathbb{E}_0[h_\alpha(J_{dt}, S_{dt})] = \mathbb{E}e^{\alpha U}h(T)$$

means

$$1 = \int_0^\infty e^{\alpha y}\,B(dy)\int_0^\infty h(s)\,A(ds),$$

$$\hat B[\alpha]\,\hat A[-\alpha - \kappa(\alpha)] = 1. \qquad (3.3)$$


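As in II.5, we can now for each $\alpha$ define a new probability measure $\mathbb{P}_{\alpha;s}$
governing $\{(J_t, S_t)\}_{t\ge0}$ by letting the likelihood ratio $L_t$ restricted to
$\mathcal{F}_t = \sigma((J_v, S_v) : 0 \le v \le t)$ be

$$L_t = e^{\alpha S_t - t\kappa(\alpha)}\,\frac{h(J_t)}{h(s)} = e^{\alpha S_t - t\kappa(\alpha)}\,e^{-(\alpha+\kappa(\alpha))(J_t - s)}$$

where $\kappa(\alpha)$ is the solution of (3.3).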


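Proposition 3.4 The probability measure $\mathbb{P}_{\alpha;s}$ is the probability measure governing
a renewal risk process with $J_0 = s$ and the interarrival distribution $A$ and
the claim size distribution $B$ changed to $A_\alpha$, resp. $B_\alpha$ where

$$A_\alpha(dt) = \frac{e^{-(\alpha+\kappa(\alpha))t}}{\hat A[-\alpha-\kappa(\alpha)]}\,A(dt), \qquad B_\alpha(dx) = \frac{e^{\alpha x}}{\hat B[\alpha]}\,B(dx).$$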

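Proof. $\mathbb{P}_{\alpha;s}(J_0 = s) = 1$ follows trivially from $L_0 = 1$. Further, since
$J_{T_1} = J_s = T_2$,

$$\mathbb{E}_{\alpha;s}e^{\beta U_1 + \delta T_2} = \mathbb{E}_s\bigl[e^{\beta U_1 + \delta T_2}L_{T_1}\bigr]$$
$$= \mathbb{E}_s\bigl[e^{\beta U_1 + \delta T_2}\,e^{\alpha(U_1 - s) - s\kappa(\alpha)}\,e^{-(\alpha+\kappa(\alpha))(T_2 - s)}\bigr]$$
$$= \hat B[\alpha+\beta]\,\hat A[\delta - \alpha - \kappa(\alpha)] = \frac{\hat B[\alpha+\beta]}{\hat B[\alpha]}\cdot\frac{\hat A[\delta - \alpha - \kappa(\alpha)]}{\hat A[-\alpha - \kappa(\alpha)]} = \hat B_\alpha[\beta]\,\hat A_\alpha[\delta],$$

which shows that $U_1, T_2$ are independent with distributions $B_\alpha$, resp. $A_\alpha$ as
asserted. An easy extension of the argument shows that $U_1, \ldots, U_n, T_2, \ldots, T_{n+1}$
are independent with distribution $A_\alpha$ for the $T_k$ and $B_\alpha$ for the $U_k$.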


Remark 3.5 For the compound Poisson case where $A$ is exponential with rate
$\beta$, (3.3) means $1 = \hat B[\alpha]\beta/(\beta + \alpha + \kappa(\alpha))$, i.e. $\kappa(\alpha) = \beta(\hat B[\alpha] - 1) - \alpha$ in
agreement with Chapter IV.


Note that the changed distributions of $A$ and $B$ are in general not the same
for $\mathbb{P}_{\alpha;s}$ and $\mathbb{P}^{(d)}_\alpha$. An important exception is, however, the determination of the
adjustment coefficient $\gamma$ where the defining equations $\kappa^{(d)}(\gamma) = 0$ and $\kappa(\gamma) = 0$
are the same.
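Outside the Poisson case of Remark 3.5, the root of (3.3) generally has to be found numerically. The sketch below is one possible way to do this and not a method taken from the text: the function name `kappa`, the argument names `mgf_b` and `lst_a`, and the bracketing-plus-bisection strategy are all illustrative assumptions. It is restricted to positive alpha with a finite m.g.f. of the claim size, so that the quantity alpha + kappa(alpha) is positive.

```python
def kappa(alpha, mgf_b, lst_a, tol=1e-12):
    """Solve mgf_b(alpha) * lst_a(alpha + kappa) = 1 for kappa, as in (3.3).

    mgf_b: function a -> E exp(a U), m.g.f. of the claim size
    lst_a: function s -> E exp(-s T), Laplace transform of the interarrival time
    Assumes alpha > 0 and mgf_b(alpha) finite.
    """
    target = 1.0 / mgf_b(alpha)         # need lst_a(s) == target with s > 0
    lo, hi = 0.0, 1.0
    while lst_a(hi) > target:           # lst_a is decreasing: widen the bracket
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if lst_a(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) - alpha


# Check against Remark 3.5: Poisson arrivals (rate 2), exponential claims (rate 3).
def exp_claim_mgf(a):
    return 3.0 / (3.0 - a)


def poisson_arrival_lst(s):
    return 2.0 / (2.0 + s)


kappa_half = kappa(0.5, exp_claim_mgf, poisson_arrival_lst)
kappa_one = kappa(1.0, exp_claim_mgf, poisson_arrival_lst)
```

With these illustrative parameters the closed form of Remark 3.5 gives -0.1 at alpha = 0.5 and 0 at alpha = 1, the latter being the adjustment coefficient.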
The Markov additive point of view is relevant when studying problems which
cannot be reduced to the imbedded random walk, say finite horizon ruin proba- 
bilities where the approach via the imbedded random walk yields results on the 
probability of ruin after N claims, not after time T. Using the Markov additive 
approach yields for example the following analogue of Theorem V.4.6: 


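Proposition 3.6 Let $y < 1/\kappa'(\gamma)$, let $\alpha_y > 0$ be the solution of $\kappa'(\alpha_y) = 1/y$,
and define $\gamma_y = \alpha_y - y\kappa(\alpha_y)$. Then

$$\psi_s(u, yu) \le \frac{e^{-(\alpha_y+\kappa(\alpha_y))s}}{\hat A[-\alpha_y - \kappa(\alpha_y)]}\,e^{-\gamma_yu} = e^{-(\alpha_y+\kappa(\alpha_y))s}\,\hat B[\alpha_y]\,e^{-\gamma_yu}.$$

In particular, for the zero-delayed case $\psi(u, yu) \le e^{-\gamma_yu}$.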


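Proof. As in the proof of Theorem V.4.6, it is easily seen that $\kappa(\alpha_y) > 0$. Let
$M(u)$ be the number of claims leading to ruin. Then $J(\tau(u)) = T_{M(u)+1}$ and
hence

$$\psi_s(u, yu) = h(s)\,\mathbb{E}_{\alpha_y;s}\Bigl[\frac{e^{-\alpha_yS_{\tau(u)} + \tau(u)\kappa(\alpha_y)}}{h(J_{\tau(u)})};\ \tau(u) \le yu\Bigr]$$
$$\le e^{-(\alpha_y+\kappa(\alpha_y))s}\,e^{-\alpha_yu + yu\kappa(\alpha_y)}\,\mathbb{E}_{\alpha_y;s}\bigl[e^{(\alpha_y+\kappa(\alpha_y))T_{M(u)+1}}\bigr]$$
$$= e^{-(\alpha_y+\kappa(\alpha_y))s}\,e^{-\gamma_yu}\,\hat A_{\alpha_y}[\alpha_y + \kappa(\alpha_y)],$$

which is the same as the asserted inequality for $\psi_s(u, yu)$. The claim for the
zero-delayed case follows by integration w.r.t. $A(ds)$.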


Notes and references The approach via the imbedded random walk is standard,
see e.g. [APQ]. The random walk interpretation also allows one to translate general
asymptotic results for finite time-horizon ruin probabilities of random walks to the
corresponding renewal model, for instance the sharp time-horizon asymptotics of
Veraverbeke & Teugels [864]. For the approach via Markov additive processes, see in
particular Dassios & Embrechts [273] and Asmussen & Rubinstein [99].


4 The duality with queueing theory 


We first review some basic facts about the GI/G/1 queue, defined as the single
server queue with first in first out (FIFO; or FCFS = first come first served)
queueing discipline and renewal interarrival times. Label the customers 1, 2, ...
and assume that $T_n$ is the time between the arrivals of customers $n - 1$ and $n$,
and $U_n$ the service time of customer $n$. The actual waiting time $W_n$ of customer
$n$ is defined as his time spent in queue (excluding the service time), that is,
the time from which he arrives to the queue until he starts service. The virtual
waiting time $V_t$ at time $t$ is the residual amount of work at time $t$, that is, the
amount of time the server will have to work until the system is empty provided
no new customers arrive (for this reason often the term workload process is used)
or, equivalently, the waiting time a customer would have if he arrived at time $t$.
Thus, since customer $n$ arrives at time $\sigma_n$, we have

$$W_n = V_{\sigma_n-} \qquad (4.1)$$

(left limit). The traffic intensity of the queue is $\rho = \mathbb{E}U/\mathbb{E}T$.


The following result shows that {W,,} is a Lindley process in the sense of 
II.4: 


Proposition 4.1 $W_{n+1} = (W_n + U_n - T_n)^+$.


Proof. The amount of residual work just before customer $n$ arrives is $V_{\sigma_n-}$. It
then jumps to $V_{\sigma_n-} + U_n$, whereas in $[\sigma_n, \sigma_{n+1}) = [\sigma_n, \sigma_n + T_n)$ the residual
work decreases linearly until possibly zero is hit, in which case $\{V_t\}$ remains at
zero until time $\sigma_{n+1}$. Thus $V_{\sigma_{n+1}-} = (V_{\sigma_n-} + U_n - T_n)^+$, and combining with
(4.1), the proposition follows.
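The recursion of Proposition 4.1 is easy to run on given data. The sketch below (function names are illustrative assumptions) computes the waiting times and also the maximum over partial sums of the increments taken in reversed order; for $W_1 = 0$ the two agree path by path, and reversing the order of i.i.d. increments does not change their joint distribution, which is the mechanism behind the distributional identity stated next in the text.

```python
def lindley_waiting_times(service, interarrival, w1=0.0):
    """Return [W_1, ..., W_{n+1}] from W_{k+1} = max(W_k + U_k - T_k, 0).

    service[k-1] plays the role of U_k and interarrival[k-1] that of T_k,
    following the indexing of Proposition 4.1.
    """
    waits = [w1]
    for u, t in zip(service, interarrival):
        waits.append(max(waits[-1] + u - t, 0.0))
    return waits


def max_of_reversed_partial_sums(service, interarrival):
    """Maximum over k = 0..n of the sum of the last k increments U_j - T_j."""
    best = partial = 0.0
    for u, t in zip(reversed(service), reversed(interarrival)):
        partial += u - t
        best = max(best, partial)
    return best


U = [2.0, 0.5, 3.0, 1.0]     # illustrative service times
T = [1.0, 2.0, 1.5, 4.0]     # illustrative interarrival times
W = lindley_waiting_times(U, T)
```

Here `W[n]` is the waiting time of customer n + 1 and coincides with `max_of_reversed_partial_sums(U[:n], T[:n])` for every n.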


Applying Theorem III.3.1, we get: 


Corollary 4.2 Let $M^{(d)}_n = \max_{k=0,\ldots,n-1}(U_1 + \cdots + U_k - T_1 - \cdots - T_k)$. If $W_1 = 0$,
then $W_n \stackrel{\mathcal D}{=} M^{(d)}_n$.

The next result summarizes the fundamental duality relations between the
steady-state behavior of the queue and the ruin probabilities (part (a) was
essentially derived already in II.4):


Proposition 4.3 Assume $\eta > 0$ or, equivalently, $\rho < 1$. Then:
(a) as $n \to \infty$, $W_n$ converges in distribution to a random variable $W$, and we
have

$$\mathbb{P}(W > u) = \psi(u); \qquad (4.2)$$

(b) as $t \to \infty$, $V_t$ converges in distribution to a random variable $V$, and we have

$$\mathbb{P}(V > u) = \psi^{(s)}(u). \qquad (4.3)$$


Proof. Part (a) is contained in Theorem III.3.1 and Corollary III.3.2, but we
shall present a slightly different proof via the duality result given in Theorem
III.2.1. Let the $T$ there be the random time $\sigma_N$. Then $\mathbb{P}(\tau(u) \le T)$ is
the probability $\psi^{(N)}(u)$ of ruin after at most $N$ claims, and obviously
$\psi(u) = \lim_{N\to\infty}\psi^{(N)}(u)$. Also $\{Z_t\}_{0\le t<T}$ evolves like the left-continuous version of the
virtual waiting time process up to just before the $N$th arrival, but interchanging
the set $(T_1, \ldots, T_N)$ with $(T_N, \ldots, T_1)$ and similarly for the $U_n$. However, by an
obvious reversibility argument this does not affect the distribution, and hence
in particular $Z_T$ is distributed as the virtual waiting time just before the $N$th
arrival, i.e. as $W_N$. It follows that $\mathbb{P}(W_N > u) = \psi^{(N)}(u)$ has the limit $\psi(u)$ for
all $u$, which implies the convergence in distribution and (4.2).

For part (b), we let $T$ be deterministic. Then the arrivals of $\{R_t\}$ in $[0, T]$
form a stationary renewal process with interarrival distribution $A$, hence (since
the residual lifetime at 0 and the age at $T$ have the same distribution, cf. A.1e)
the same is true for the time-reversed point process which is the interarrival
process for $\{Z_t\}_{0\le t\le T}$. Thus as before, $\{Z_t\}_{0\le t\le T}$ has the same distribution as
the left-continuous version of the virtual waiting time process so that

$$\mathbb{P}^{(s)}(V_T > u) = \mathbb{P}^{(s)}(\tau(u) \le T), \qquad (4.4)$$

$$\lim_{T\to\infty}\mathbb{P}^{(s)}(V_T > u) = \lim_{T\to\infty}\mathbb{P}^{(s)}(\tau(u) \le T) = \psi^{(s)}(u).$$


It should be noted that this argument only establishes the convergence in
distribution subject to certain initial conditions, namely $W_1 = 0$ in (a) and $V_0 = 0$,
$T_1 \sim A_0$ in (b). In fact, convergence in distribution holds for arbitrary initial
conditions, but this requires some additional arguments (involving regeneration
at 0, but not difficult) that we omit.

Letting n — oo in Corollary 4.2, we obtain: 


Corollary 4.4 The steady-state actual waiting time $W$ has the same distribution
as $M^{(d)}$.


Corollary 4.5 (LINDLEY'S INTEGRAL EQUATION) Let $F(x) = \mathbb{P}(U_1 - T_1 \le x)$,
$K(x) = \mathbb{P}(W \le x)$. Then

$$K(x) = \int_{-\infty}^{x}K(x - y)\,F(dy), \qquad x \ge 0. \qquad (4.5)$$


Proof. Letting $n \to \infty$ in Proposition 4.1, we get $W \stackrel{\mathcal D}{=} (W + U^* - T^*)^+$,
where $U^*, T^*$ are independent and distributed as $U_1$, resp. $T_1$. Hence for $x \ge 0$,
conditioning upon $U^* - T^* = y$ yields

$$K(x) = \mathbb{P}((W + U^* - T^*)^+ \le x) = \mathbb{P}(W + U^* - T^* \le x) = \int_{-\infty}^{x}K(x - y)\,F(dy)$$

($x \ge 0$ is crucial for the second equality!).


Now return to the Poisson case. Then the corresponding queue is M/G/1, 
and we get: 


Corollary 4.6 For the M/G/1 queue with $\rho < 1$, the actual and the virtual
waiting time have the same distribution in the steady state. That is, $W \stackrel{\mathcal D}{=} V$.


Proof. For the Poisson case, the zero-delayed and the stationary renewal processes
are identical. Hence $\psi(u) = \psi^{(s)}(u)$, implying $\mathbb{P}(W > u) = \mathbb{P}(V > u)$ for
all $u$.


Notes and references The GI/G/1 queue is a favorite of almost any queueing 
book (see e.g. Cohen [249] or [APQ, Ch. X]), despite the fact that the extension from 
M/G/1 is of equally doubtful relevance as we argued in Section 1 to be the case in risk 
theory. Some early classical papers are Smith [814] and Lindley [598]. 

Note that (4.5) looks like the convolution equation K = F * K but is not the same 
(one would need (4.5) to hold for all x € R and not just x > 0). Equation (4.5) is 
in fact a homogeneous Wiener-Hopf equation, see e.g. Asmussen [66] and references 
therein. 

For some further explicit treatments beyond exponential claim sizes, see e.g. Ma- 
linovski [627] and Rongming & Haifeng [747]. 

The imbedded random walk approach also leads to a Pollaczek-Khinchine type
formula in the renewal case, which will be exploited for the asymptotic behavior of the 
ruin probability with heavy-tailed claims in Section X.3. A detailed exposition of the 
compound geometric approach to renewal models is Willmot & Lin [892]. Whenever 
the interclaim time is phase-type, renewal models are a special case of Markov additive 
processes and hence we also refer to Chapter IX for related material. 

Exploiting some links between wave governed random motions and the renewal 
risk process, Mazza & Rullière [632] give an algorithm for computing finite-time ruin 
probabilities for non-exponential interarrival times. 

A number of further results on ruin-related quantities in renewal models will be 
discussed in XII.3. 


Chapter VII 


Risk theory in a Markovian 
environment 


1 Model and examples 


We assume that the arrivals form an inhomogeneous Poisson process, more
precisely determined by a Markov process $\{J_t\}_{0\le t<\infty}$ with a finite state space
$E$ as follows:


• The arrival intensity is $\beta_i$ when $J_t = i$;
• Claims arriving when $J_t = i$ have distribution $B_i$;
• The premium rate when $J_t = i$ is $p_i$.


Thus, $\{J_t\}$ describes the environmental conditions for the risk process. The
intensity matrix governing $\{J_t\}$ is denoted by $\Lambda = (\lambda_{ij})_{i,j\in E}$ and its stationary
limiting distribution by $\pi$; here $\pi$ exists whenever $\Lambda$ is irreducible which is
assumed throughout, and can be computed as the positive solution of $\pi\Lambda = 0$,
$\pi e = 1$ where $e$ is the $E$-column-vector of ones. As in Chapter I, $\{S_t\}$ denotes
the claim surplus process,

$$S_t = \sum_{i=1}^{N_t}U_i - \int_0^t p_{J_v}\,dv,$$

and $\tau(u) = \inf\{t \ge 0 : S_t > u\}$, $M = \sup_{t\ge0}S_t$. The ruin probabilities with
initial environment $i$ are

$$\psi_i(u) = \mathbb{P}_i(\tau(u) < \infty) = \mathbb{P}_i(M > u), \qquad \psi_i(u, T) = \mathbb{P}_i(\tau(u) \le T),$$

where as usual $\mathbb{P}_i$ refers to the case $J_0 = i$.
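The stationary distribution is defined above as the solution of a small linear system, which can be solved directly. The following is a minimal sketch in plain Python; the function name and the choice of Gauss-Jordan elimination with partial pivoting are illustrative assumptions, and the intensity matrix is assumed irreducible so that replacing one balance equation by the normalisation gives a nonsingular system.

```python
def stationary_distribution(Lam):
    """Solve pi * Lam = 0, sum(pi) = 1 for an irreducible intensity matrix.

    Lam is given as a list of rows. The transposed system Lam^T pi^T = 0 is
    set up with its last (redundant) equation replaced by the normalisation.
    """
    n = len(Lam)
    A = [[Lam[j][i] for j in range(n)] + [0.0] for i in range(n)]
    A[-1] = [1.0] * n + [1.0]
    for col in range(n):
        # partial pivoting: bring the largest remaining entry to the diagonal
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [x - factor * y for x, y in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


# Two-state illustration: rate 1 from state 0 to 1, rate 3 from state 1 to 0.
pi_two = stationary_distribution([[-1.0, 1.0], [3.0, -3.0]])
```

In the two-state illustration the chain spends three quarters of the time in the state it leaves more slowly, as expected.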

Unless otherwise stated, we shall assume that $p_i = 1$; this is no restriction
when studying infinite horizon ruin probabilities, cf. the operational time
argument given in Example 1.5 below.

We let

$$\rho_i = \beta_i\mu_{B_i}, \qquad \rho = \sum_{i\in E}\pi_i\rho_i, \qquad \eta = \frac{1 - \rho}{\rho}. \qquad (1.1)$$

Then $\rho_i$ is the average amount of claims received per unit time when the
environment is in state $i$, and $\rho$ is the overall average amount of claims per unit
time, cf. Proposition 1.11 below.
An example of how such a mechanism could be relevant in risk theory follows. 


Example 1.1 Consider car insurance, and assume that weather conditions play
a major role for the occurrence of accidents. For example, we could distinguish
between normal and icy road conditions, leading to $E$ having two states $n, i$
and corresponding arrival intensities $\beta_n, \beta_i$ and claim size distributions $B_n, B_i$;
one expects that $\beta_i > \beta_n$ and presumably also that $B_n \ne B_i$, meaning that
accidents occurring during icy road conditions lead to claim amounts which are
different from the normal ones.


The versatility of the model in terms of incorporating (or at least approxi- 
mating) many phenomena which look very different or more complicated at a 
first sight goes in fact much further (note that for the following discussion a 
basic knowledge of phase-type distributions is needed, cf. IX.1): 


Example 1.2 (ALTERNATING RENEWAL ENVIRONMENT) The model of Example 1.1
implicitly assumes that the sojourn times of the environment in the
normal and the icy states are exponential, with rates $\lambda_{ni}$ and $\lambda_{in}$, respectively,
which is clearly unrealistic. Thus, assume that the sojourn time in the icy
state has a more general distribution $A^{(i)}$. According to Theorem A5.14, we
can approximate $A^{(i)}$ with a phase-type distribution (cf. Example I.2.4) with
representation $(E^{(i)}, \alpha^{(i)}, T^{(i)})$, say. Assume similarly that the sojourn time in
the normal state has distribution $A^{(n)}$ which we approximate with a phase-type
distribution with representation $(E^{(n)}, \alpha^{(n)}, T^{(n)})$, say. Then the state space
for the environment is the disjoint union of $E^{(n)}$ and $E^{(i)}$, and we have $\beta_j = \beta_i$
when $j \in E^{(i)}$, $\beta_j = \beta_n$ when $j \in E^{(n)}$; in block-partitioned form, the intensity
matrix is

$$\Lambda = \begin{pmatrix} T^{(n)} & t^{(n)}\alpha^{(i)} \\ t^{(i)}\alpha^{(n)} & T^{(i)} \end{pmatrix}$$

where $t^{(n)} = -T^{(n)}e$, $t^{(i)} = -T^{(i)}e$ are the exit rates.
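The block form of the intensity matrix in Example 1.2 can be assembled mechanically from the two phase-type representations. The sketch below uses plain nested lists; the function name and the concrete sojourn distributions (an exponential normal period and an Erlang(2) icy period) are illustrative assumptions only.

```python
def alternating_renewal_generator(alpha_n, T_n, alpha_i, T_i):
    """Build [[T_n, t_n alpha_i], [t_i alpha_n, T_i]] as a list of rows.

    alpha_*: initial vectors (lists summing to 1)
    T_*:     phase generators (lists of rows)
    """
    def exit_rates(T):
        return [-sum(row) for row in T]              # t = -T e

    def outer(col, row):
        return [[c * r for r in row] for c in col]   # column vector times row vector

    top = [a + b for a, b in zip(T_n, outer(exit_rates(T_n), alpha_i))]
    bottom = [a + b for a, b in zip(outer(exit_rates(T_i), alpha_n), T_i)]
    return top + bottom


# Normal period: exponential with rate 1. Icy period: Erlang(2) with rate 4.
Lam = alternating_renewal_generator(
    alpha_n=[1.0], T_n=[[-1.0]],
    alpha_i=[1.0, 0.0], T_i=[[-4.0, 4.0], [0.0, -4.0]],
)
```

Each row of the result sums to zero, as it must for an intensity matrix, because the initial vectors sum to one and the exit rates balance the phase generators.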
Example 1.3 Consider again the alternating renewal model for car insurance
in Example 1.2, but assume now that the arrival intensity changes during the
icy period, say it is larger initially. One way to model this would be to take $A^{(i)}$
to be Coxian (cf. Example IX.1.4) with states $i_1, \ldots, i_q$ (visited in that order)
and let $\beta_{i_1} > \cdots > \beta_{i_q}$.


Example 1.4 (SEMI-MARKOVIAN ENVIRONMENT) Dependence between the
length of an icy period and the following normal one (and vice versa) can be
modelled by semi-Markov structure. This amounts to a family $(A^{(\eta)})_{\eta\in H}$ of
sojourn time distributions, such that a sojourn time of type $\eta$ is followed by one
of type $\iota$ w.p. $w_{\eta\iota}$, where $W = (w_{\eta\iota})_{\eta,\iota\in H}$ is a transition matrix. Approximating
each $A^{(\eta)}$ by a phase-type distribution with representation $(E^{(\eta)}, \alpha^{(\eta)}, T^{(\eta)})$,
say, the state space $E$ for the environment is $\{(\eta, i) : \eta \in H, i \in E^{(\eta)}\}$, and

$$\Lambda = \begin{pmatrix}
T^{(1)} + w_{11}t^{(1)}\alpha^{(1)} & w_{12}t^{(1)}\alpha^{(2)} & \cdots & w_{1q}t^{(1)}\alpha^{(q)} \\
w_{21}t^{(2)}\alpha^{(1)} & T^{(2)} + w_{22}t^{(2)}\alpha^{(2)} & \cdots & w_{2q}t^{(2)}\alpha^{(q)} \\
\vdots & & \ddots & \vdots \\
w_{q1}t^{(q)}\alpha^{(1)} & w_{q2}t^{(q)}\alpha^{(2)} & \cdots & T^{(q)} + w_{qq}t^{(q)}\alpha^{(q)}
\end{pmatrix}$$

where $q = |H|$, $t^{(\eta)} = -T^{(\eta)}e$. The simplest model for the arrival intensity
amounts to $\beta_{\eta,i} = \beta_\eta$ depending only on $\eta$.

In the car insurance example, one could for example have $H = \{i_\ell, i_s, n_\ell, n_s\}$,
such that the icy period is of two types (long and short) each with their sojourn
time distribution $A^{(i_\ell)}$, resp. $A^{(i_s)}$, and similarly for the normal period. Then
for example $w_{i_\ell n_s}$ is the probability that a long icy period is followed by a short
normal one.


Example 1.5 (MARKOV-MODULATED PREMIUMS) Returning for a short while
to the case of general premium rates $p_i$ depending on the environment $i$, let

$$\theta(T) = \int_0^T p_{J_t}\,dt, \qquad \tilde J_t = J_{\theta^{-1}(t)}, \qquad \tilde S_t = S_{\theta^{-1}(t)}.$$

Then (by standard operational time arguments) $\{\tilde S_t\}$ is a risk process in a
Markovian environment with unit premium rate, and $\tilde\psi_i(u) = \psi_i(u)$. Indeed,
the parameters are $\tilde\lambda_{ij} = \lambda_{ij}/p_i$, $\tilde\beta_i = \beta_i/p_i$.
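The operational time change of Example 1.5 amounts to dividing row i of the intensity matrix and the arrival intensity in state i by the premium rate in that state, while the claim size distributions are left untouched. A minimal sketch, with an illustrative function name and illustrative numbers:

```python
def unit_premium_parameters(Lam, beta, p):
    """Rescale row i of Lam and beta[i] by 1 / p[i] (operational time)."""
    Lam_tilde = [[x / p_i for x in row] for row, p_i in zip(Lam, p)]
    beta_tilde = [b / p_i for b, p_i in zip(beta, p)]
    return Lam_tilde, beta_tilde


Lam_tilde, beta_tilde = unit_premium_parameters(
    Lam=[[-1.0, 1.0], [3.0, -3.0]],   # intensity matrix of the environment
    beta=[2.0, 4.0],                  # arrival intensities
    p=[2.0, 0.5],                     # premium rates per state
)
```

A state with a high premium rate is traversed quickly in operational time, so both its exit intensities and its claim arrival intensity are scaled down.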


From now on, we assume again $p_i = 1$ so that the claim surplus is

$$S_t = \sum_{i=1}^{N_t}U_i - t.$$

We turn to some more mathematically oriented basic discussion. The key
property for much of the analysis presented below is the following immediate 
observation: 


Proposition 1.6 The claim surplus process $\{S_t\}$ of a risk process in a Markovian
environment is a Markov additive process corresponding to the parameters
$\mu_i = -p_i$, $\sigma_i^2 = 0$, $\nu_i(dx) = \beta_iB_i(dx)$, $q_{ij} = 0$ in the notation of Chapter II.4.


In particular, the Markov additive structure will be used for exponential change 
of measure and thereby versions of Lundberg’s inequality and the Cramér- 
Lundberg approximation. 

Next we note a semi-Markov structure of the arrival process: 


Proposition 1.7 The $\mathbb{P}_i$-distribution of $T_1$ is phase-type with representation
$(E, e_i^{\mathsf T}, \Lambda - (\beta_i)_{\rm diag})$. More precisely,

$$\mathbb{P}_i(T_1 \in dx,\ J_{T_1} = j) = \beta_j\cdot e_i^{\mathsf T}e^{(\Lambda - (\beta_i)_{\rm diag})x}e_j\,dx.$$


Proof. The result immediately follows by noting that $T_1$ is obtained as the
lifelength of $\{J_t\}$ killed at the time of the first arrival and that the exit rate
obviously is $\beta_j$ in state $j$.


A remark which is fundamental for much of the intuition on the model con- 
sists in noting that to each risk process in a Markovian environment, one can 
associate in a natural way a standard Poisson one by averaging over the envi- 
ronment. More precisely, we put 


$$\beta^* = \sum_{i\in E}\pi_i\beta_i, \qquad B^* = \sum_{i\in E}\frac{\pi_i\beta_i}{\beta^*}\,B_i.$$


These parameters are the ones which the statistician would estimate if he ignored 
the presence of Markov-modulation: 


Proposition 1.8 As $t \to \infty$,

$$\frac{N_t}{t} \stackrel{\rm a.s.}{\to} \beta^*, \qquad \frac{1}{N_t}\sum_{\ell=1}^{N_t}I(U_\ell \le x) \stackrel{\rm a.s.}{\to} B^*(x).$$

Note that the last statement of the proposition just means that in the limit, the
empirical distribution of the claims is $B^*$. Note also that (as the proof shows)
$\pi_i\beta_i/\beta^*$ gives the proportion of the claims which are of type $i$ (arrive in state $i$).
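Proof. Let $t_i = \int_0^t I(J_s = i)\,ds$ be the time spent in state $i$ up to time $t$ and $N^{(i)}_t$
the number of claim arrivals in state $i$. Then it is standard that $t_i/t \stackrel{\rm a.s.}{\to} \pi_i$ as
$t \to \infty$. However, given $\{J_t\}_{0\le t<\infty}$, we may view $N^{(i)}_t$ as the number of events
in a Poisson process where the accumulated intensity at time $t$ is $\beta_it_i$. Hence

$$\frac{N^{(i)}_t}{t_i} \stackrel{\rm a.s.}{\to} \beta_i, \qquad \frac{N^{(i)}_t}{t} \stackrel{\rm a.s.}{\to} \pi_i\beta_i, \qquad \frac{N_t}{t} \stackrel{\rm a.s.}{\to} \sum_{i\in E}\pi_i\beta_i = \beta^*.$$

Also, denoting the sizes of the claims arriving in state $i$ by $U^{(i)}_1, U^{(i)}_2, \ldots$, the
standard law of large numbers yields

$$\frac{1}{N}\sum_{k=1}^{N}I(U^{(i)}_k \le x) \stackrel{\rm a.s.}{\to} B_i(x), \qquad N \to \infty.$$

Hence

$$\frac{1}{N_t}\sum_{\ell=1}^{N_t}I(U_\ell \le x) = \sum_{i\in E}\frac{N^{(i)}_t}{N_t}\cdot\frac{1}{N^{(i)}_t}\sum_{k=1}^{N^{(i)}_t}I(U^{(i)}_k \le x) \stackrel{\rm a.s.}{\to} \sum_{i\in E}\frac{\pi_i\beta_i}{\beta^*}\,B_i(x) = B^*(x).$$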


A different interpretation of B* is as the Palm distribution of the claim size, 
cf. Example III.5.4. 

The next result shows that we can think of the averaged compound Poisson 
risk model as the limit of the Markov-modulated one obtained by speeding up 
the Markov-modulation. 


Proposition 1.9 Consider a Markov-modulated risk process $\{S_t\}$ with parameters
$\beta_i, B_i, \Lambda$, and let $\{S^{(a)}_t\}$ refer to the one with parameters $\beta_i, B_i, a\Lambda$, $\{S^*_t\}$
to the compound Poisson model with parameters $\beta^*, B^*$. Then
$\{S^{(a)}_t\} \stackrel{\mathcal D}{\to} \{S^*_t\}$ in $D[0, \infty)$ as $a \to \infty$. In particular, $\psi^{(a)}_i(u) \to \psi^*(u)$ for all $u$ and $i$.


Proof. According to Proposition 1.7, the $\mathbb{P}_i$-distribution of $T_1$ in $\{S^{(a)}_t\}$ is
phase-type with representation $(E, e_i^{\mathsf T}, a\Lambda - (\beta_i)_{\rm diag})$. By Proposition A5.2, this
converges to the exponential distribution with rate $\beta^*$ as $a \to \infty$, and furthermore
in the limit $J_{T_1}$ has distribution $(\pi_i\beta_i/\beta^*)_{i\in E}$ and is independent of $T_1$. In
particular, the limiting distribution of the first claim size $U_1$ is $B^*$. Conditioning
upon $\mathcal{F}_{T_1}$ shows similarly that in the limit $(T_2, U_2)$ are independent of $\mathcal{F}_{T_1}$,
with $T_2$ being exponential with rate $\beta^*$ and $U_2$ having distribution $B^*$. Continuing
in this manner shows that the limiting distribution of $(T_n, U_n)_{n=1,2,\ldots}$ is as
in $\{S^*_t\}$. From this the convergence in distribution follows by general facts on
weak convergence in $D[0, \infty)$, which also yields $\psi^{(a)}_i(u, T) \to \psi^*(u, T)$ for all $u$
and $T$. The fact that indeed $\psi^{(a)}_i(u) \to \psi^*(u)$ follows, e.g., from Theorem 3.2.1
of [370].


Example 1.10 Let

$$\Lambda = \begin{pmatrix} -a & a \\ a & -a \end{pmatrix}, \qquad \beta_1 = \frac92,\quad \beta_2 = \frac32, \qquad B_1 = \frac35E_3 + \frac25E_7,\quad B_2 = \frac15E_3 + \frac45E_7,$$

where $E_\delta$ denotes the exponential distribution with intensity parameter $\delta$ and
$a > 0$ is arbitrary. That is, we may imagine that we have two types of claims
such that the claim size distributions are $E_3$ and $E_7$. Claims of type $E_3$ arrive
with intensity $\frac92\cdot\frac35 = \frac{27}{10}$ in state 1 and with intensity $\frac32\cdot\frac15 = \frac{3}{10}$ in state 2,
those of type $E_7$ with intensity $\frac92\cdot\frac25 = \frac95$ in state 1 and with intensity $\frac32\cdot\frac45 = \frac65$
in state 2. Thus, since $E_3$ is a more dangerous claim size distribution than $E_7$
(the mean is larger and the tail is heavier), state 1 appears as more dangerous
than state 2, and in fact

$$\rho_1 = \beta_1\mu_{B_1} = \frac92\Bigl(\frac35\cdot\frac13 + \frac25\cdot\frac17\Bigr) = \frac{81}{70}, \qquad \rho_2 = \beta_2\mu_{B_2} = \frac32\Bigl(\frac15\cdot\frac13 + \frac45\cdot\frac17\Bigr) = \frac{19}{70}.$$

Thus in state 1 where $\rho_1 > 1$, the company even suffers an average loss, and (at
least when $a$ is small such that state changes of the environment are infrequent),
the paths of the surplus process will exhibit the type of behavior in Fig. VII.1
with periods with positive drift alternating with periods with negative drift; the
overall drift is negative since $\pi = (\frac12\ \frac12)$ so that $\rho = \pi_1\rho_1 + \pi_2\rho_2 = \frac57$.
On Fig. VII.1, there are $p = 2$ background states of $\{J_t\}$, marked by thin, resp.
thick, lines in the path of $\{S_t\}$.

Computing the parameters of the averaged compound Poisson model, we
first get that

$$\beta^* = \frac12\cdot\frac92 + \frac12\cdot\frac32 = 3.$$

FIGURE VII.1

Thus, a fraction $\pi_1\beta_1/\beta^* = 3/4$ of the claims occurs in state 1 and the remaining
fraction 1/4 in state 2. Hence

$$B^* = \frac34\Bigl(\frac35E_3 + \frac25E_7\Bigr) + \frac14\Bigl(\frac15E_3 + \frac45E_7\Bigr) = \frac12E_3 + \frac12E_7.$$


That is, the averaged compound Poisson model is the same as in IV.(3.1). 
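The arithmetic of Example 1.10 can be reproduced exactly with rational numbers. The sketch below uses Python's `fractions` module; the variable names are illustrative, and the inputs are the parameter values of the example.

```python
from fractions import Fraction as Fr

pi = [Fr(1, 2), Fr(1, 2)]                    # stationary distribution of the environment
beta = [Fr(9, 2), Fr(3, 2)]                  # arrival intensities in states 1 and 2
# Mixture weights of (E_3, E_7) in the claim size distributions B_1 and B_2.
weights = [[Fr(3, 5), Fr(2, 5)], [Fr(1, 5), Fr(4, 5)]]
means = [Fr(1, 3), Fr(1, 7)]                 # means of E_3 and E_7

mu_B = [sum(w * m for w, m in zip(row, means)) for row in weights]
rho_i = [b * m for b, m in zip(beta, mu_B)]              # claims per unit time, by state
rho = sum(p * r for p, r in zip(pi, rho_i))              # overall average

beta_star = sum(p * b for p, b in zip(pi, beta))         # averaged arrival intensity
share = [p * b / beta_star for p, b in zip(pi, beta)]    # proportion of claims per state
# Weights of E_3 and E_7 in the averaged claim size distribution.
B_star = [sum(s * row[k] for s, row in zip(share, weights)) for k in range(2)]
```

The computed values agree with those quoted in the example: state-wise loads 81/70 and 19/70, overall load 5/7, averaged intensity 3, and equal weights on the two exponential components.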


The definition (1.1) of the safety loading $\eta$ is (as for the renewal model in
Chapter VI) based upon an asymptotic consideration given by the following
result:


Proposition 1.11 (a) $\mathbb{E}S_t/t \to \rho - 1$, $t \to \infty$;
(b) $S_t/t \to \rho - 1$ a.s., $t \to \infty$.


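Proof. In the notation of Proposition 1.8, we have

$$\mathbb{E}[S_t + t \mid (t_i)_{i\in E}] = \sum_{i\in E}t_i\beta_i\mu_{B_i} = \sum_{i\in E}t_i\rho_i.$$

Taking expectations and using the well-known fact $\mathbb{E}t_i/t \to \pi_i$ yields (a). For
(b), note first that $\sum_{k=1}^{N}U^{(i)}_k/N \stackrel{\rm a.s.}{\to} \mu_{B_i}$. Hence

$$\frac{S_t + t}{t} = \sum_{i\in E}\frac{N^{(i)}_t}{t}\cdot\frac{1}{N^{(i)}_t}\sum_{k=1}^{N^{(i)}_t}U^{(i)}_k \stackrel{\rm a.s.}{\to} \sum_{i\in E}\pi_i\beta_i\mu_{B_i} = \rho.$$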
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Corollary 1.12 If <0, then M = œ a.s., and hence w;(u) = 1 for alli and 
u. Ifn >0, then M < œ a.s., and w;(u) <1 for alli and u. 


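Proof. The case $\eta < 0$ is trivial since then the a.s. limit $\rho - 1$ of $S_t/t$ is $> 0$, and hence $M = \infty$. The case $\eta > 0$ is similarly easy. Now let $\eta = 0$, let some state $i$ be fixed and define

$$\omega = \omega_1 = \inf\{t > 0 : J_{t-} \ne i,\ J_t = i\}, \qquad \omega_2 = \inf\{t > \omega_1 : J_{t-} \ne i,\ J_t = i\},$$

$$X_1 = S_{\omega_1}, \qquad X_2 = S_{\omega_2} - S_{\omega_1},$$

and so on. Then by standard Markov process formulas (e.g. [APQ, Th. II.4.2(i), p. 50]) $E_i\omega = -1/(\pi_i\lambda_{ii})$ and

$$E_i X = E_i\int_0^{\omega}\beta_{J_t}\mu_{B_{J_t}}\,dt - E_i\omega = E_i\omega\cdot\Bigl(\sum_{j\in E}\pi_j\beta_j\mu_{B_j} - 1\Bigr) = (\rho - 1)E_i\omega = 0.$$

Now obviously the $\omega_n$ form a renewal process, and hence $\omega_n/n \to E_i\omega$ a.s. Since the $X_n$ are independent, with $X_2, X_3, \ldots$ having the $P_i$-distribution of $X$, also

$$\frac{S_{\omega_n}}{n} = \frac{X_1 + \cdots + X_n}{n} \to E_i X = 0 \quad\text{a.s.}$$

Thus $\{S_{\omega_n}\}$ is a discrete time random walk with mean zero, and hence oscillates between $-\infty$ and $\infty$ so that also here $M = \infty$.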


Notes and references The Markov-modulated Poisson process has been very 
popular in queueing theory since the early 1980s, see the Notes to Section 7. In risk 
theory, some early studies are in Janssen & Reinhard [501, 730, 502], and a more 
comprehensive treatment in Asmussen [58]. The mainstream of the present chapter 
follows [58], with some important improvements being obtained in Asmussen [59] in 
the queueing setting and being implemented numerically in Asmussen & Rolski [97]. 

Statistical aspects are not treated here. See Meier [634] and Rydén [760, 761]. 
There still seems to be more to be done in this area, in particular in order to treat more than low-dimensional state spaces E.

Proposition 1.11 and the Corollary are standard. The proof of Proposition 1.11(b) is essentially the same as the proof of the strong law of large numbers for cumulative processes, see [APQ, p. 178] or A.1d.


2 The ladder height distribution 


Our mathematical treatment of the ruin problem follows the model of Chapter IV for the simple compound Poisson model, and involves a version of the Pollaczeck-Khinchine formula (see Proposition 2.2(a) below) where the ladder height distribution is evaluated by a time reversion argument.
Define the ladder epoch $\tau_+$ by $\tau_+ = \inf\{t : S_t > 0\} = \tau(0)$, let

$$G_+(i,j;A) = P_i(S_{\tau_+} \in A,\ J_{\tau_+} = j,\ \tau_+ < \infty)$$

and let $G_+$ be the measure-valued matrix with $ij$th element $G_+(i,j;\cdot)$. The form of $G_+$ turns out to be explicit (or at least computable), but is substantially more involved than for the compound Poisson case. However, by specializing results for general stationary risk processes (Theorem III.5.5; see also Example III.5.4) we obtain the following result, which represents a nice simplified form of the ladder height distribution $G_+$ when taking certain averages: starting $\{J_t\}$ stationary, we get the same ladder height distribution as for the averaged compound Poisson model, cf. the definition of $\beta^*$, $B^*$ in Section 1.


Proposition 2.1 $\pi G_+(dy)e = \beta^*\overline{B}^*(y)\,dy$.


For measure-valued matrices, we define the convolution operation by the same rule as for multiplication of real-valued matrices, only with the product of real numbers replaced by convolution of measures. Thus, e.g., $G_+^{*2}$ is the matrix whose $ij$th element is

$$\sum_{k\in E} G_+(i,k;\cdot) * G_+(k,j;\cdot).$$

Also, $\|G_+\|$ denotes the matrix with $ij$th element

$$\|G_+(i,j;\cdot)\| = \int_0^\infty G_+(i,j;dx).$$

Let further $R$ denote the pre-$\tau_+$ occupation kernel,

$$R(i,j;A) = E_i\int_0^{\tau_+} I(S_t \in A,\ J_t = j)\,dt,$$

and $S(dx)$ the measure-valued diagonal matrix with $\beta_iB_i(dx)$ as $i$th diagonal element.


Proposition 2.2 (a) The distribution of $M$ is given by

$$1 - \psi_i(u) = P_i(M \le u) = e_i'\sum_{n=0}^\infty G_+^{*n}(u)\,(I - \|G_+\|)e. \qquad (2.1)$$

(b) $G_+(y,\infty) = \int_{-\infty}^0 R(dx)\,S((y-x,\infty))$. That is, for $i,j \in E$,

$$G_+(i,j;(y,\infty)) = \int_{-\infty}^0 R(i,j;dx)\,\beta_j\overline{B}_j(y-x). \qquad (2.2)$$

Proof. The probability that there are $n$ proper ladder steps not exceeding $x$ and that the environment is $j$ at the $n$th when we start from $i$ is $e_i'G_+^{*n}(x)e_j$, and the probability that there are no further ladder steps starting from environment $j$ is $e_j'(I - \|G_+\|)e$. From this (2.1) follows by summing over $n$ and $j$. The proof of (2.2) is just the same as the proof of Lemma III.5.3.


To make Proposition 2.2 useful, we need as in Chapters III, IV to bring $R$ and $G_+$ into a more explicit form. To this end, we need to invoke the time-reversed version $\{J_t^*\}$ of $\{J_t\}$; the intensity matrix $\Lambda^*$ has $ij$th element

$$\lambda_{ij}^* = \frac{\pi_j}{\pi_i}\lambda_{ji}$$

and we have

$$P_i(J_T^* = j) = \frac{\pi_j}{\pi_i}P_j(J_T = i). \qquad (2.3)$$
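As a small numerical illustration of the time reversal, the following sketch computes $\pi$ and $\Lambda^*$ for a three-state cyclic (hence non-reversible) chain. The intensity matrix and the helper names `stationary` and `time_reversed` are assumptions made for this illustration, not part of the text.

```python
import numpy as np

def stationary(Lam):
    """Solve pi Lam = 0, pi e = 1 (as a consistent least-squares system)."""
    d = Lam.shape[0]
    A = np.vstack([Lam.T, np.ones(d)])
    b = np.zeros(d + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

def time_reversed(Lam):
    """Return (Lambda*, pi) with lambda*_ij = pi_j lambda_ji / pi_i."""
    pi = stationary(Lam)
    return (Lam.T * pi[None, :]) / pi[:, None], pi

# Hypothetical cyclic chain 1 -> 2 -> 3 -> 1
Lam = np.array([[-2.0, 2.0, 0.0],
                [0.0, -3.0, 3.0],
                [1.0, 0.0, -1.0]])
Lam_star, pi = time_reversed(Lam)
```

The reversed chain runs through the cycle in the opposite direction, while $\Lambda^*$ is again an intensity matrix with the same stationary distribution $\pi$.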


We let $\{S_t^*\}$ be defined as $\{S_t\}$, only with $\{J_t\}$ replaced by $\{J_t^*\}$ (the $\beta_i$ and $B_i$ are the same), and let further $\{m_x\}$ be the $E$-valued process obtained by observing $\{J_t^*\}$ only when $\{S_t^*\}$ is at a minimum value. That is, $m_x = j$ when for some (necessarily unique) $t$ we have $S_t^* = -x$, $J_t^* = j$, $S_t^* < S_u^*$ for $u < t$; see Figure VII.2 for an illustration in the case of $p = 2$ environmental states of $\{J_t^*\}$, marked by thin, resp. thick, lines in the path of $\{S_t^*\}$.


FIGURE VII.2 


The following observation is immediate: 


Proposition 2.3 When $\eta > 0$, $\{m_x\}$ is a non-terminating Markov process on $E$, hence uniquely specified by its intensity matrix $Q$ (say).

Proposition 2.4 $Q$ satisfies the non-linear matrix equation $Q = \varphi(Q)$ where

$$\varphi(Q) = \Lambda^* - (\beta_i)_{\rm diag} + \int_0^\infty S(dx)\,e^{Qx},$$

and $S(dx)$ is the diagonal matrix with the $\beta_iB_i(dx)$ on the diagonal. Furthermore, the sequence $\{Q^{(n)}\}$ defined by

$$Q^{(0)} = \Lambda^* - (\beta_i)_{\rm diag}, \qquad Q^{(n+1)} = \varphi(Q^{(n)})$$

converges monotonically to $Q$.


Note that the integral in the definition of $\varphi(Q)$ is the matrix whose $i$th row is the $i$th row of

$$\beta_i\int_0^\infty B_i(dx)\,e^{Qx} = \beta_i\widehat{B}_i[Q].$$


Proof. The argument relies on an interpretation in terms of excursions. An excursion of $\{S_t^*\}$ above level $-x$ starts at time $t$ if $S_{t-}^* = -x$, $\{S_v^*\}$ is at a minimum value at $v = t-$ and a jump (claim arrival) occurs at time $t$, and the excursion ends at time $s = \inf\{v > t : S_v^* = -x\}$. If there are no jumps in $(t, s]$, we say
that the excursion has depth 0. Otherwise each jump at a minimum level during 
the excursion starts a subexcursion, and the excursion is said to have depth 1 
if each of these subexcursions have depth 0. In general, we recursively define 
the depth of an excursion as 1 plus the maximal depth of a subexcursion. The 
definitions are illustrated on Figure VII.3 where there are three excursions of 
depth 1,0,2. For example the excursion of depth 2 has one subexcursion which 
is of depth 1, corresponding to two subexcursions of depth 0. 


FIGURE VII.3

Let $p_{ij}^{(n)}$ be the probability that an excursion starting from $J_t^* = i$ has depth at most $n$ and terminates at $J_s^* = j$, and $p_{ij}$ the probability that an excursion starting from $J_t^* = i$ terminates at $J_s^* = j$. By considering minimum values within the excursion, it becomes clear that

$$p_{ij} = \int_0^\infty \bigl[e^{Qy}\bigr]_{ij}\,B_i(dy). \qquad (2.4)$$

To show $Q = \varphi(Q)$, we first compute $q_{ij}$ for $i \ne j$. Suppose $m_x = i$. Then a jump to $j$ (i.e., $m_{x+dx} = j$) occurs in two ways, either due to a jump of $\{J_t^*\}$ which occurs with intensity $\lambda_{ij}^*$, or through an arrival starting an excursion terminating with $J_s^* = j$. It follows that $q_{ij} = \lambda_{ij}^* + \beta_ip_{ij}$. Similarly, state $i$ is left at rate $-\lambda_{ii}^* + \beta_i$ and re-entered through an excursion at rate $\beta_ip_{ii}$, which implies $q_{ii} = \lambda_{ii}^* - \beta_i + \beta_ip_{ii}$. Writing out in matrix notation, $Q = \varphi(Q)$ follows.

Now let $\{m_x^{(n)}\}$ be $\{m_x\}$ killed at the first time $\eta_n$ (say) a subexcursion of depth at least $n$ occurs. It is clear that $\{m_x^{(n)}\}$ is a terminating Markov process and that $\{m_x^{(0)}\}$ has subintensity matrix $\Lambda^* - (\beta_i)_{\rm diag} = Q^{(0)}$. The proof of $Q = \varphi(Q)$ then immediately carries over to show that the subintensity matrix of $\{m_x^{(1)}\}$ is $\varphi(Q^{(0)}) = Q^{(1)}$. Similarly by induction, the subintensity matrix of $\{m_x^{(n+1)}\}$ is $\varphi(Q^{(n)}) = Q^{(n+1)}$ which implies that

$$q_{ij}^{(n+1)} = \lambda_{ij}^* - \delta_{ij}\beta_i + \beta_ip_{ij}^{(n)}.$$

Now just note that $p_{ij}^{(n)} \uparrow p_{ij}$ and insert (2.4).


Define a further kernel $U$ by

$$U(i,j;A) = \int_{-A} P_i(m_x = j)\,dx = \int_{-A} e_i'e^{Qx}e_j\,dx \qquad (2.5)$$

(note that we use $-A = \{x : -x \in A\}$ on the r.h.s. of the definition to make $U$ be concentrated on $(-\infty,0)$).


Theorem 2.5 $R(i,j;A) = \dfrac{\pi_j}{\pi_i}U(j,i;A)$.

Proof. We shall show that

$$P_i(J_t = j,\ S_t \in A,\ \tau_+ > t) = \frac{\pi_j}{\pi_i}P_j(J_t^* = i,\ S_t^* \in A,\ S_t^* \le S_u^*,\ u \le t), \qquad (2.6)$$

from which the result immediately follows by integrating from 0 to $\infty$ w.r.t. $dt$. To this end, consider stationary versions of $\{J_t\}$, $\{J_t^*\}$. We may then assume $J_u^* = J_{t-u}$, $S_u^* = S_t - S_{t-u}$, $0 \le u \le t$, and get

$$\pi_iP_i(J_t = j,\ S_t \in A,\ \tau_+ > t)$$
$$= P_\pi(J_t = j,\ J_0 = i,\ S_t \in A,\ S_u \le 0,\ 0 \le u \le t)$$
$$= P_\pi(J_0^* = j,\ J_t^* = i,\ S_t^* \in A,\ S_t^* \le S_u^*,\ 0 \le u \le t)$$
$$= \pi_jP_j(J_t^* = i,\ S_t^* \in A,\ S_t^* \le S_u^*,\ 0 \le u \le t),$$

and this immediately yields (2.6).


It is convenient at this stage to rewrite the above results in terms of the matrix $K = \Delta^{-1}Q'\Delta$, where $\Delta$ is the diagonal matrix with $\pi$ on the diagonal:


Corollary 2.6 (a) $R(dx) = e^{-Kx}dx$, $x \le 0$;
(b) for $z \ge 0$, $G_+((z,\infty)) = \int_0^\infty e^{Kx}S((x+z,\infty))\,dx$;
(c) the matrix $K$ satisfies the non-linear matrix equation $K = \varphi(K)$ where

$$\varphi(K) = \Lambda - (\beta_i)_{\rm diag} + \int_0^\infty e^{Kx}S(dx);$$

(d) the sequence $\{K^{(n)}\}$ defined by $K^{(0)} = \Lambda - (\beta_i)_{\rm diag}$, $K^{(n+1)} = \varphi(K^{(n)})$ converges monotonically to $K$.


[The $\varphi(\cdot)$ here is of course not the same as in Proposition 2.4.]

From $Qe = 0$, it is readily checked that $\pi$ is a left eigenvector of $K$ corresponding to the eigenvalue 0 (when $\rho < 1$), and we let $k$ be the corresponding right eigenvector normalized by $\pi k = 1$.
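The iteration scheme of Corollary 2.6(d) is straightforward to try out numerically. The following is a minimal sketch, assuming a two-state model with $\Lambda$ having off-diagonal elements $a = 1$, $\beta = (9/2, 3/2)$ and hyperexponential claims $B_1 = \frac35E_3+\frac25E_7$, $B_2 = \frac15E_3+\frac45E_7$ (so $\rho = 5/7 < 1$). For such claims $\int_0^\infty e^{Kx}\delta e^{-\delta x}dx = \delta(\delta I - K)^{-1}$, so the integral in $\varphi$ is available in closed form. The function names `phi` and `solve_K` are illustrative.

```python
import numpy as np

# Assumed illustrative model: two states, hyperexponential claims
Lam = np.array([[-1.0, 1.0], [1.0, -1.0]])     # environment intensity matrix (a = 1)
beta = np.array([4.5, 1.5])                    # arrival intensities beta_i
rates = np.array([3.0, 7.0])                   # exponential rates delta_k
weights = np.array([[0.6, 0.4], [0.2, 0.8]])   # B_i = sum_k weights[i, k] * E_{rates[k]}
pi = np.array([0.5, 0.5])                      # stationary distribution of Lam

def phi(K):
    """phi(K) = Lam - diag(beta) + int_0^inf exp(Kx) S(dx)."""
    d = len(beta)
    out = Lam - np.diag(beta)
    for i in range(d):
        # int_0^inf exp(Kx) B_i(dx) = sum_k w_ik delta_k (delta_k I - K)^{-1}
        Bhat = sum(w * r * np.linalg.inv(r * np.eye(d) - K)
                   for w, r in zip(weights[i], rates))
        out[:, i] += beta[i] * Bhat[:, i]      # S(dx) multiplies from the right
    return out

def solve_K(tol=1e-13, max_iter=100_000):
    K = Lam - np.diag(beta)                    # K^(0)
    for _ in range(max_iter):
        K_new = phi(K)
        if np.max(np.abs(K_new - K)) < tol:
            return K_new
        K = K_new
    raise RuntimeError("iteration did not converge")

K = solve_K()
```

Since $S(dx)$ is diagonal and multiplies from the right, only the $i$th column of $\beta_i\widehat{B}_i[K]$ enters the $i$th column of $\varphi(K)$. Under the stated assumptions the limit should have $\pi$ as a left eigenvector for the eigenvalue 0.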


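Remark 2.7 It is instructive to see how Proposition 2.1 can be rederived using the more detailed form of $G_+$ in Corollary 2.6(b): from $\pi K = 0$ we get $\pi e^{Kx} = \pi$ and hence

$$\pi G_+(dy)e = \int_0^\infty \pi e^{Kx}\bigl(\beta_iB_i(dy + x)\bigr)_{\rm diag}\,dx\cdot e$$
$$= \pi\int_0^\infty \bigl(\beta_iB_i(dy + x)\bigr)_{\rm diag}\,dx\cdot e$$
$$= \sum_{i\in E}\pi_i\beta_i\overline{B}_i(y)\,dy = \beta^*\overline{B}^*(y)\,dy.$$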

Though maybe Corollary 2.6 is hardly all that explicit in general, we shall see that nevertheless we have enough information to derive, e.g., the Cramér-Lundberg approximation (Section 3), and to obtain a simple solution in the special case of phase-type claims (Chapter IX). As preparation, we shall give at this place some simple consequences of Corollary 2.6.


Lemma 2.8 $(I - \|G_+\|)e = (1-\rho)k$.

Proof. Using Corollary 2.6(b) with $z = 0$, we get

$$\|G_+\| = \int_0^\infty e^{Kx}S((x,\infty))\,dx. \qquad (2.7)$$

In particular, multiplying by $K$ and integrating by parts yields

$$K\|G_+\| = \int_0^\infty (e^{Kx} - I)\,S(dx) = K - \Lambda + (\beta_i)_{\rm diag} - \int_0^\infty S(dx) = K - \Lambda. \qquad (2.8)$$

Let $L = (k\pi - K)^{-1}$. Then $(k\pi - K)k = k$ implies $Lk = k$. Now using (2.7), (2.8) and $\pi e^{Kx} = \pi$, we get

$$k\pi\|G_+\|e = k\int_0^\infty \pi S((x,\infty))e\,dx = k(\pi_i\beta_i\mu_{B_i})_{\rm row}e = \rho k,$$
$$K\|G_+\|e = Ke,$$
$$(k\pi - K)(I - \|G_+\|)e = k - Ke - \rho k + Ke = (1-\rho)k.$$

Multiplying by $L$ to the left, the proof is complete.
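Lemma 2.8 can be checked numerically once $K$ is available. The sketch below again assumes the two-state hyperexponential model with $a = 1$, $\beta = (9/2, 3/2)$, $B_1 = \frac35E_3+\frac25E_7$, $B_2 = \frac15E_3+\frac45E_7$. It computes $K$ by the iteration of Corollary 2.6, evaluates (2.7) via $\int_0^\infty e^{Kx}e^{-\delta x}dx = (\delta I - K)^{-1}$, and compares both sides of the lemma. The names are illustrative only.

```python
import numpy as np

# Assumed illustrative model (same two-state hyperexponential example)
Lam = np.array([[-1.0, 1.0], [1.0, -1.0]])
beta = np.array([4.5, 1.5])
rates = np.array([3.0, 7.0])
weights = np.array([[0.6, 0.4], [0.2, 0.8]])
pi = np.array([0.5, 0.5])
d = 2
I = np.eye(d)
e = np.ones(d)

def resolvents(K):
    """(delta_k I - K)^{-1} for each exponential rate delta_k."""
    return [np.linalg.inv(r * I - K) for r in rates]

def phi(K):
    res = resolvents(K)
    out = Lam - np.diag(beta)
    for i in range(d):
        out[:, i] += beta[i] * sum(w * r * R[:, i]
                                   for w, r, R in zip(weights[i], rates, res))
    return out

K = Lam - np.diag(beta)
for _ in range(5000):                 # fixed number of steps of K^(n+1) = phi(K^(n))
    K = phi(K)

# ||G_+|| = int_0^inf exp(Kx) S((x, inf)) dx, cf. (2.7)
G = np.zeros((d, d))
for i in range(d):
    G[:, i] = beta[i] * sum(w * R[:, i] for w, R in zip(weights[i], resolvents(K)))

# right eigenvector k of K for the eigenvalue 0, normalized by pi k = 1
vals, vecs = np.linalg.eig(K)
k = np.real(vecs[:, np.argmin(np.abs(vals))])
k = k / (pi @ k)

rho = pi @ (beta * (weights @ (1.0 / rates)))
psi0 = G @ e                          # psi_i(0) = P_i(tau_+ < infinity)
```

Here `psi0` holds the ruin probabilities with zero initial reserve; their stationary average $\pi\|G_+\|e$ equals $\rho$, in line with Proposition 2.1.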


Here is an alternative algorithm to the iteration scheme in Corollary 2.6 for computing $K$. Let $|A|$ denote the determinant of the matrix $A$ and $d$ the number of states in $E$.


Proposition 2.9 The following assertions are equivalent:
(a) all $d$ eigenvalues of $K$ are distinct;
(b) there exist $d$ distinct solutions $s_1,\ldots,s_d \in \{s \in \mathbb{C} : \Re(s) < 0\}$ of

$$\bigl|\Lambda + (\beta_i(\widehat{B}_i[s] - 1))_{\rm diag} - sI\bigr| = 0. \qquad (2.9)$$

In that case, $s_1,\ldots,s_d$ are precisely the eigenvalues of $K$, and the corresponding left row eigenvectors $a_1,\ldots,a_d$ can be computed by

$$a_i\bigl(\Lambda + (\beta_j(\widehat{B}_j[s_i] - 1))_{\rm diag} - s_iI\bigr) = 0. \qquad (2.10)$$

Thus,

$$K = \begin{pmatrix} a_1 \\ \vdots \\ a_d \end{pmatrix}^{-1}\begin{pmatrix} s_1a_1 \\ \vdots \\ s_da_d \end{pmatrix}. \qquad (2.11)$$

Proof. Since $K$ is similar to the subintensity matrix $Q$, all eigenvalues must indeed be in $\{s \in \mathbb{C} : \Re(s) < 0\}$.
Assume $aK = sa$. Then multiplying $K = \varphi(K)$ by $a$ to the left, we get

$$sa = a\Bigl(\Lambda - (\beta_i)_{\rm diag} + \int_0^\infty e^{sx}S(dx)\Bigr) = a\bigl(\Lambda - (\beta_i)_{\rm diag} + (\beta_i\widehat{B}_i[s])_{\rm diag}\bigr).$$

It follows that if (a) holds, then so does (b), and the eigenvalues and eigenvectors can be computed as asserted.

The proof that (b) implies (a) is more involved and omitted; see Asmussen [58].
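The root-based construction (2.9)-(2.11) can be sketched as follows for the same assumed two-state hyperexponential model ($a = 1$, $\beta = (9/2, 3/2)$, $B_1 = \frac35E_3+\frac25E_7$, $B_2 = \frac15E_3+\frac45E_7$). In this sketch the root $s_1 = 0$ (which solves (2.9) because $|\Lambda| = 0$, and which is the eigenvalue of $K$ with eigenvector $\pi$ when $\rho < 1$) is used together with one negative real root located by bisection. The bracket $[-100, -10^{-6}]$ and all names are assumptions of the illustration.

```python
import numpy as np

# Assumed illustrative model (same two-state hyperexponential example)
Lam = np.array([[-1.0, 1.0], [1.0, -1.0]])
beta = np.array([4.5, 1.5])
rates = np.array([3.0, 7.0])
weights = np.array([[0.6, 0.4], [0.2, 0.8]])
pi = np.array([0.5, 0.5])

def M(s):
    """Lam + diag(beta_i (Bhat_i[s] - 1)) - s I, the matrix in (2.9)."""
    Bhat = weights @ (rates / (rates - s))     # Bhat_i[s] for each state i
    return Lam + np.diag(beta * (Bhat - 1.0)) - s * np.eye(2)

def f(s):
    return np.linalg.det(M(s))

# bisection for the negative real root of (2.9)
lo, hi = -100.0, -1e-6
assert f(lo) > 0 > f(hi)
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if f(mid) > 0:
        lo = mid
    else:
        hi = mid
s2 = 0.5 * (lo + hi)

def left_null(A):
    """Left null vector of a (numerically) singular matrix via the SVD."""
    U, _, _ = np.linalg.svd(A)
    return U[:, -1]

roots = np.array([0.0, s2])
A = np.vstack([left_null(M(s)) for s in roots])   # rows a_1, a_2 of (2.10)
K = np.linalg.solve(A, roots[:, None] * A)        # (2.11)

def phi(K):
    """phi(K) of Corollary 2.6(c), in closed form for exponential mixtures."""
    out = Lam - np.diag(beta)
    for i in range(2):
        Bhat = sum(w * r * np.linalg.inv(r * np.eye(2) - K)
                   for w, r in zip(weights[i], rates))
        out[:, i] += beta[i] * Bhat[:, i]
    return out
```

The function `phi` is included only to confirm that the matrix assembled from (2.11) is a fixed point of the equation in Corollary 2.6(c).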


In the computation of the Cramér-Lundberg constant $C$, we shall also need some formulas which are only valid if $\rho > 1$ instead of (as up to now) $\rho < 1$. Let $M_+$ denote the matrix with $ij$th entry

$$M_+(i,j) = \int_0^\infty x\,G_+(i,j;dx).$$


Lemma 2.10 Assume $\rho > 1$. Then $\|G_+\|$ is stochastic with invariant probability vector $\zeta_+$ (say) proportional to $-\pi K$, $\zeta_+ = -\pi K/(-\pi Ke)$. Furthermore,

$$-\pi KM_+e = \rho - 1.$$


Proof. From $\rho > 1$ it follows that $S_t \to \infty$ a.s. and hence $\|G_+\|$ is stochastic. That $-\pi K = -e'Q'\Delta$ is non-zero and has non-negative components follows since $-Qe$ has the same property for $\rho > 1$. Thus the formula for $\zeta_+$ follows immediately by multiplying (2.8) by $-\pi$, which yields $-\pi K\|G_+\| = -\pi K$. Further

$$M_+ = \int_0^\infty dz\int_0^\infty e^{Kx}S((x+z,\infty))\,dx$$
$$= \int_0^\infty dy\int_0^y e^{Kx}\,dx\,S((y,\infty))$$
$$= K^{-1}\int_0^\infty (e^{Ky} - I)\,S((y,\infty))\,dy,$$
$$-\pi KM_+e = \pi\int_0^\infty dy\,(I - e^{Ky})\,S((y,\infty))e$$
$$= \pi(\beta_i\mu_{B_i})_{\rm diag}e - \pi\|G_+\|e = \rho - 1$$

(since $\|G_+\|$ being stochastic implies $\|G_+\|e = e$).


Notes and references The exposition follows Asmussen [59] closely (the proof 
of Proposition 2.4 is different). The problem of computing G+ may be viewed as 
a special case of Wiener-Hopf factorization for continuous-time random walks with 
Markov-dependent increments (Markov additive processes); the discrete-time case is 
surveyed in Asmussen [57] and references given therein. 


3 Change of measure via exponential families 


We first recall some notation and some results which were given in Chapter II in a more general Markov additive process context. Define $F_t$ as the measure-valued matrix with $ij$th entry $F_t(i,j;x) = P_i[S_t \le x;\ J_t = j]$, and $\widehat{F}_t[s]$ as the matrix with $ij$th entry $\widehat{F}_t[i,j;s] = E_i[e^{sS_t};\ J_t = j]$ (thus, $\widehat{F}_t[s]$ may be viewed as the matrix m.g.f. of $F_t$ defined by entrywise integration). Define further

$$K[\alpha] = \Lambda + \bigl(\beta_i(\widehat{B}_i[\alpha] - 1)\bigr)_{\rm diag} - \alpha I$$

(the matrix function $K[\alpha]$ is of course not related to the matrix $K$ of the preceding section). Then (Proposition III.4.2):


Proposition 3.1 $\widehat{F}_t[\alpha] = e^{tK[\alpha]}$.


It follows from II.4 that $K[\alpha]$ has a simple and unique eigenvalue $\kappa(\alpha)$ with maximal real part, such that the corresponding left and right eigenvectors $\nu^{(\alpha)}$, $h^{(\alpha)}$ may be taken with strictly positive components. We shall use the normalization $\nu^{(\alpha)}e = \nu^{(\alpha)}h^{(\alpha)} = 1$. Note that since $K[0] = \Lambda$, we have $\nu^{(0)} = \pi$, $h^{(0)} = e$. The function $\kappa(\alpha)$ plays the role of an appropriate generalization of the c.g.f., see Theorem III.4.7.

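Now consider some $\theta$ such that all $\widehat{B}_i[\theta]$ and hence $\kappa(\theta)$, $\nu^{(\theta)}$, $h^{(\theta)}$ etc. are well-defined. The aim is to define governing parameters $\beta_{\theta;i}$, $B_{\theta;i}$, $\Lambda_\theta = (\lambda_{ij}^{(\theta)})_{i,j\in E}$ for a risk process, such that one can obtain suitable generalizations of the likelihood ratio identities of Chapter III and thereby of Lundberg's inequality, the Cramér-Lundberg approximation etc.

According to Theorem III.4.11, the appropriate choice is

$$\beta_{\theta;i} = \beta_i\widehat{B}_i[\theta], \qquad B_{\theta;i}(dx) = \frac{e^{\theta x}}{\widehat{B}_i[\theta]}B_i(dx),$$

$$\Lambda_\theta = \Delta_{h^{(\theta)}}^{-1}K[\theta]\Delta_{h^{(\theta)}} - \kappa(\theta)I = \Delta_{h^{(\theta)}}^{-1}\Lambda\Delta_{h^{(\theta)}} + \bigl(\beta_i(\widehat{B}_i[\theta] - 1)\bigr)_{\rm diag} - (\kappa(\theta) + \theta)I,$$

where $\Delta_{h^{(\theta)}}$ is the diagonal matrix with $h_i^{(\theta)}$ as $i$th diagonal element. That is,

$$\lambda_{ij}^{(\theta)} = \begin{cases} \dfrac{h_j^{(\theta)}}{h_i^{(\theta)}}\lambda_{ij}, & i \ne j, \\[2mm] \lambda_{ii} + \beta_i(\widehat{B}_i[\theta] - 1) - \kappa(\theta) - \theta, & i = j. \end{cases}$$

We recall that it was shown in II.4 that $\Lambda_\theta$ is an intensity matrix, that $E_i\bigl[e^{\theta S_t}h_{J_t}^{(\theta)}\bigr] = e^{t\kappa(\theta)}h_i^{(\theta)}$ and that $\bigl\{e^{\theta S_t - t\kappa(\theta)}h_{J_t}^{(\theta)}\bigr\}_{t\ge 0}$ is a martingale.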

We let $P_{\theta;i}$ be the governing probability measure for a risk process with parameters $\beta_{\theta;i}$, $B_{\theta;i}$, $\Lambda_\theta$ and initial environment $J_0 = i$. Recall that if $P_{\theta;i}^{(T)}$ is the restriction of $P_{\theta;i}$ to $\mathcal{F}_T = \sigma\{(S_t, J_t) : t \le T\}$ and $P_i^{(T)} = P_{0;i}^{(T)}$, then $P_{\theta;i}^{(T)}$ and $P_i^{(T)}$ are equivalent for $T < \infty$. More generally, allowing $T$ to be a stopping time, Theorem III.1.3 takes the following form:


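Proposition 3.2 Let $\tau$ be any stopping time and let $G \in \mathcal{F}_\tau$, $G \subseteq \{\tau < \infty\}$. Then

$$P_iG = P_{0;i}G = h_i^{(\theta)}E_{\theta;i}\Bigl[\frac{1}{h_{J_\tau}^{(\theta)}}\exp\{-\theta S_\tau + \tau\kappa(\theta)\};\ G\Bigr]. \qquad (3.1)$$


Let $\widehat{F}_{\theta;t}[s]$, $\kappa_\theta(s)$ and $\rho_\theta$ be defined the same way as $\widehat{F}_t[s]$, $\kappa(s)$ and $\rho$, only with the original risk process replaced by the one with changed parameters.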


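Lemma 3.3 $\widehat{F}_{\theta;t}[s] = e^{-t\kappa(\theta)}\Delta_{h^{(\theta)}}^{-1}\widehat{F}_t[s + \theta]\Delta_{h^{(\theta)}}$.


Proof. Use III.(4.5).


Lemma 3.4 $\kappa_\theta(s) = \kappa(s+\theta) - \kappa(\theta)$. In particular, $\rho_\theta > 1$ whenever $\kappa'(\theta) > 0$.


Proof. The first formula follows by Lemma 3.3 and the second from $\rho_\theta - 1 = \kappa_\theta'(0) = \kappa'(\theta)$.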
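The change of parameters can be illustrated numerically. The sketch below assumes the two-state hyperexponential model used for illustration ($a = 1$, $\beta = (9/2, 3/2)$, $B_1 = \frac35E_3+\frac25E_7$, $B_2 = \frac15E_3+\frac45E_7$) and the arbitrary value $\theta = 0.8$. It builds $\beta_{\theta;i}$, $\widehat{B}_{\theta;i}$ and $\Lambda_\theta$ and compares $\kappa_\theta(s)$ with $\kappa(s+\theta) - \kappa(\theta)$. All function names are illustrative.

```python
import numpy as np

# Assumed illustrative model (two-state hyperexponential example)
Lam = np.array([[-1.0, 1.0], [1.0, -1.0]])
beta = np.array([4.5, 1.5])
rates = np.array([3.0, 7.0])
weights = np.array([[0.6, 0.4], [0.2, 0.8]])

def Bhat(s):
    """Vector of m.g.f.s Bhat_i[s] (finite for s < 3)."""
    return weights @ (rates / (rates - s))

def Kmat(alpha, Lam_=Lam, beta_=beta, Bh=Bhat):
    """K[alpha] = Lam + diag(beta_i (Bhat_i[alpha] - 1)) - alpha I."""
    return Lam_ + np.diag(beta_ * (Bh(alpha) - 1.0)) - alpha * np.eye(2)

def perron(A):
    """Eigenvalue with maximal real part and a positive right eigenvector."""
    vals, vecs = np.linalg.eig(A)
    j = np.argmax(vals.real)
    h = np.real(vecs[:, j])
    return vals[j].real, h / h.sum()

theta = 0.8                                   # arbitrary illustrative value
kappa_theta, h = perron(Kmat(theta))
D = np.diag(h)

# changed parameters
beta_theta = beta * Bhat(theta)
Bhat_theta = lambda s: Bhat(s + theta) / Bhat(theta)
Lam_theta = np.linalg.inv(D) @ Kmat(theta) @ D - kappa_theta * np.eye(2)

def kappa(alpha):
    return perron(Kmat(alpha))[0]

def kappa_tilted(s):
    return perron(Kmat(s, Lam_theta, beta_theta, Bhat_theta))[0]
```

The scaling of $h$ is immaterial here because only the ratios $h_j/h_i$ enter $\Lambda_\theta$.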


Notes and references The exposition here and in the next two subsections (on 
likelihood ratio identities and Lundberg conjugation) follows Asmussen [58] closely 
(but is somewhat more self-contained). 


3a Lundberg conjugation 


Since the definition of $\kappa(s)$ is a direct extension of the definition for the Cramér-Lundberg model, the Lundberg equation is $\kappa(\gamma) = 0$. We assume that a solution $\gamma > 0$ exists and use notation like $P_{L;i}$ instead of $P_{\gamma;i}$; also, for brevity we write $h = h^{(\gamma)}$ and $\nu = \nu^{(\gamma)}$.
Substituting $\theta = \gamma$, $\tau = \tau(u)$, $G = \{\tau(u) < \infty\}$ in Proposition 3.2, letting $\xi(u) = S_{\tau(u)} - u$ be the overshoot and noting that $P_{L;i}(\tau(u) < \infty) = 1$ by Lemma 3.4, we obtain:


Corollary 3.5

$$\psi_i(u,T) = h_ie^{-\gamma u}E_{L;i}\Bigl[\frac{e^{-\gamma\xi(u)}}{h_{J_{\tau(u)}}};\ \tau(u) \le T\Bigr], \qquad (3.2)$$

$$\psi_i(u) = h_ie^{-\gamma u}E_{L;i}\Bigl[\frac{e^{-\gamma\xi(u)}}{h_{J_{\tau(u)}}}\Bigr]. \qquad (3.3)$$

Noting that $\xi(u) \ge 0$, (3.3) yields


Corollary 3.6 (LUNDBERG'S INEQUALITY) $\psi_i(u) \le \dfrac{h_i}{\min_{j\in E}h_j}e^{-\gamma u}$.


Assuming it has been shown that $C = \lim_{u\to\infty}E_{L;i}\bigl[e^{-\gamma\xi(u)}/h_{J_{\tau(u)}}\bigr]$ exists and is independent of $i$ (which is not too difficult, cf. the proof of Lemma 3.8 below), it also follows immediately that $\psi_i(u) \sim h_iCe^{-\gamma u}$. However, the calculation of $C$ is non-trivial. Recall the definition of $G_+$, $K$, $k$ from Section 2.
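The Lundberg equation $\kappa(\gamma) = 0$ and the prefactor in Lundberg's inequality are easy to evaluate numerically. The following sketch assumes the two-state hyperexponential model used for illustration ($a = 1$, $\beta = (9/2, 3/2)$, $B_1 = \frac35E_3+\frac25E_7$, $B_2 = \frac15E_3+\frac45E_7$), for which $\kappa$ is finite on $(0,3)$, and locates $\gamma$ by bisection using the convexity of $\kappa$ with $\kappa(0) = 0$, $\kappa'(0) < 0$. The bracket and the names are assumptions of the sketch.

```python
import numpy as np

# Assumed illustrative model (two-state example, switching rate a = 1)
Lam = np.array([[-1.0, 1.0], [1.0, -1.0]])
beta = np.array([4.5, 1.5])
rates = np.array([3.0, 7.0])
weights = np.array([[0.6, 0.4], [0.2, 0.8]])

def K_of(alpha):
    """K[alpha] = Lam + diag(beta_i (Bhat_i[alpha] - 1)) - alpha I."""
    Bhat = weights @ (rates / (rates - alpha))
    return Lam + np.diag(beta * (Bhat - 1.0)) - alpha * np.eye(2)

def kappa(alpha):
    """Eigenvalue of K[alpha] with maximal real part."""
    return np.linalg.eigvals(K_of(alpha)).real.max()

def lundberg_exponent(lo=1e-6, hi=3.0 - 1e-9, steps=200):
    # kappa < 0 on (0, gamma) and kappa > 0 on (gamma, 3)
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if kappa(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

gamma = lundberg_exponent()

# right eigenvector h = h^(gamma); only ratios matter for the bound
vals, vecs = np.linalg.eig(K_of(gamma))
h = np.real(vecs[:, np.argmax(vals.real)])
h = h / h.sum()
prefactor = h / h.min()     # psi_i(u) <= prefactor[i] * exp(-gamma * u)
```

In this assumed example the more dangerous state 1 receives the larger prefactor, while the prefactor of state 2 equals 1.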


Theorem 3.7 (THE CRAMÉR-LUNDBERG APPROXIMATION) In the light-tailed case, $\psi_i(u) \sim h_iCe^{-\gamma u}$, where

$$C = \frac{1-\rho}{\rho_L - 1}\,\nu k. \qquad (3.4)$$

To calculate $C$, we need two lemmas. For the first, recall the definition of $\zeta_+$, $M_+$ in Lemma 2.10.


Lemma 3.8 As $u \to \infty$, $(\xi(u), J_{\tau(u)})$ converges in distribution w.r.t. $P_{L;i}$, with the density $g_j(x)$ (say) of the limit $(\xi(\infty), J_{\tau(\infty)})$ at $\xi(\infty) = x$, $J_{\tau(\infty)} = j$ being independent of $i$ and given by

$$g_j(x) = \frac{1}{\zeta_+^LM_+^Le}\sum_{\ell\in E}\zeta_{+;\ell}^L\,G_+^L(\ell,j;(x,\infty)).$$


Proof. We shall need to invoke the concept of semi-regeneration, see A.1f. Interpreting the ladder points as semi-regeneration points (the types being the environmental states in which they occur), $\{(\xi(u), J_{\tau(u)})\}$ is semi-regenerative with the first semi-regeneration point being $(\xi(0), J_{\tau(0)}) = (S_{\tau_+}, J_{\tau_+})$. The formula for $g_j(x)$ now follows immediately from Proposition A1.7, noting that the non-lattice property is obvious because all $G_+^L(\ell,j;\cdot)$ have densities.

Lemma 3.9 $K^L = \Delta_h^{-1}K\Delta_h - \gamma I$, $\widehat{G}_+^L[-\gamma] = \Delta_h^{-1}\|G_+\|\Delta_h$, $\widehat{G}_+[\gamma]h = h$.


Proof. Appealing to the occupation measure interpretation of $K$, cf. Corollary 2.6, we get for $x < 0$ that

$$e_i'e^{-Kx}e_j\,dx = \int_0^\infty P_i(S_t \in dx,\ J_t = j,\ \tau_+ > t)\,dt$$
$$= \frac{h_i}{h_j}e^{-\gamma x}\int_0^\infty P_{L;i}(S_t \in dx,\ J_t = j,\ \tau_+ > t)\,dt$$
$$= \frac{h_i}{h_j}e^{-\gamma x}\,e_i'e^{-K^Lx}e_j\,dx,$$

which is equivalent to the first statement of the lemma. The proof of the second is a similar but easier application of the basic likelihood ratio identity Proposition 3.2. In the same way we get $\widehat{G}_+[\gamma] = \Delta_h\|G_+^L\|\Delta_h^{-1}$, and since $\|G_+^L\|e = e$, it follows that

$$\widehat{G}_+[\gamma]h = \Delta_h\|G_+^L\|\Delta_h^{-1}h = \Delta_h\|G_+^L\|e = \Delta_he = h.$$


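Proof of Theorem 3.7. Using Lemma 3.8, we get

$$E_{L;i}\bigl[e^{-\gamma\xi(\infty)};\ J_{\tau(\infty)} = j\bigr] = \int_0^\infty e^{-\gamma x}g_j(x)\,dx$$
$$= \frac{1}{\zeta_+^LM_+^Le}\sum_{\ell\in E}\zeta_{+;\ell}^L\int_0^\infty e^{-\gamma x}G_+^L(\ell,j;(x,\infty))\,dx$$
$$= \frac{1}{\zeta_+^LM_+^Le}\sum_{\ell\in E}\zeta_{+;\ell}^L\int_0^\infty \frac{1}{\gamma}(1 - e^{-\gamma x})\,G_+^L(\ell,j;dx)$$
$$= \frac{1}{\gamma\zeta_+^LM_+^Le}\sum_{\ell\in E}\zeta_{+;\ell}^L\bigl(\|G_+^L(\ell,j;\cdot)\| - \widehat{G}_+^L[\ell,j;-\gamma]\bigr).$$

In matrix formulation, this means that

$$C = E_{L;i}\Bigl[\frac{e^{-\gamma\xi(\infty)}}{h_{J_{\tau(\infty)}}}\Bigr] = \frac{1}{\gamma\zeta_+^LM_+^Le}\,\zeta_+^L\bigl(\|G_+^L\| - \widehat{G}_+^L[-\gamma]\bigr)\Delta_h^{-1}e$$
$$= \frac{1}{\gamma\zeta_+^LM_+^Le}\,\zeta_+^L\bigl(I - \widehat{G}_+^L[-\gamma]\bigr)\Delta_h^{-1}e$$
$$= \frac{1}{\gamma(\rho_L - 1)}(-\pi^LK^L)\bigl(I - \widehat{G}_+^L[-\gamma]\bigr)\Delta_h^{-1}e,$$

using Lemma 2.10 for the last two equalities. Inserting first Lemma 3.9 and next Lemma 2.8, this becomes

$$\frac{1}{\gamma(\rho_L - 1)}\pi^L\Delta_h^{-1}(\gamma I - K)(I - \|G_+\|)e = \frac{1-\rho}{\gamma(\rho_L - 1)}\pi^L\Delta_h^{-1}(\gamma I - K)k = \frac{1-\rho}{\rho_L - 1}\pi^L\Delta_h^{-1}k.$$

Thus, to complete the proof it only remains to check that $\pi^L = \nu\Delta_h$. The normalization $\nu h = 1$ ensures $\nu\Delta_he = 1$. Finally,

$$\nu\Delta_h\Lambda_L = \nu\Delta_h\Delta_h^{-1}K[\gamma]\Delta_h = 0$$

since by definition $\nu K[\gamma] = \kappa(\gamma)\nu = 0$.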


3b Ramifications of Lundberg’s inequality 


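We consider first the time-dependent version of Lundberg's inequality, cf. V.4. The idea is as there to substitute $T = yu$ in $\psi_i(u,T)$ and to replace the Lundberg exponent $\gamma$ by $\gamma_y = \alpha_y - y\kappa(\alpha_y)$, where $\alpha_y$ is the unique solution of

$$\kappa'(\alpha_y) = \frac{1}{y}. \qquad (3.5)$$

Graphically, the situation is just as in Fig. V.1. Thus, one always has $\gamma_y > \gamma$, whereas $\alpha_y > \gamma$, $\kappa(\alpha_y) > 0$ when $y < 1/\kappa'(\gamma)$, and $\alpha_y < \gamma$, $\kappa(\alpha_y) < 0$ when $y > 1/\kappa'(\gamma)$.


Theorem 3.10 Let $C_+^{(0)}(y) = \dfrac{1}{\min_{i\in E}h_i^{(\alpha_y)}}$. Then

$$\psi_i(u,yu) \le C_+^{(0)}(y)\,h_i^{(\alpha_y)}e^{-\gamma_yu}, \qquad y < \frac{1}{\kappa'(\gamma)}, \qquad (3.6)$$

$$\psi_i(u) - \psi_i(u,yu) \le C_+^{(0)}(y)\,h_i^{(\alpha_y)}e^{-\gamma_yu}, \qquad y > \frac{1}{\kappa'(\gamma)}. \qquad (3.7)$$

Proof. Consider first the case $y < 1/\kappa'(\gamma)$. Then, since $\kappa(\alpha_y) > 0$, (3.1) yields

$$\psi_i(u,yu) = h_i^{(\alpha_y)}E_{\alpha_y;i}\Bigl[\frac{1}{h_{J_{\tau(u)}}^{(\alpha_y)}}\exp\{-\alpha_yS_{\tau(u)} + \tau(u)\kappa(\alpha_y)\};\ \tau(u) \le yu\Bigr]$$
$$= h_i^{(\alpha_y)}e^{-\alpha_yu}E_{\alpha_y;i}\Bigl[\frac{1}{h_{J_{\tau(u)}}^{(\alpha_y)}}\exp\{-\alpha_y\xi(u) + \tau(u)\kappa(\alpha_y)\};\ \tau(u) \le yu\Bigr]$$
$$\le h_i^{(\alpha_y)}C_+^{(0)}(y)e^{-\alpha_yu}E_{\alpha_y;i}\bigl[e^{\tau(u)\kappa(\alpha_y)};\ \tau(u) \le yu\bigr]$$
$$\le h_i^{(\alpha_y)}C_+^{(0)}(y)e^{-\alpha_yu + yu\kappa(\alpha_y)}.$$

Similarly, if $y > 1/\kappa'(\gamma)$, we have $\kappa(\alpha_y) < 0$ and get

$$\psi_i(u) - \psi_i(u,yu) = h_i^{(\alpha_y)}e^{-\alpha_yu}E_{\alpha_y;i}\Bigl[\frac{1}{h_{J_{\tau(u)}}^{(\alpha_y)}}\exp\{-\alpha_y\xi(u) + \tau(u)\kappa(\alpha_y)\};\ yu < \tau(u) < \infty\Bigr]$$
$$\le h_i^{(\alpha_y)}C_+^{(0)}(y)e^{-\alpha_yu}E_{\alpha_y;i}\bigl[e^{\tau(u)\kappa(\alpha_y)};\ yu < \tau(u) < \infty\bigr]$$
$$\le h_i^{(\alpha_y)}C_+^{(0)}(y)e^{-\alpha_yu + yu\kappa(\alpha_y)}.$$

Note that the proof appears to use less information than is inherent in the definition (3.5). However, as in the classical case (3.5) will produce the maximal $\gamma_y$ for which the argument works.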

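Our next objective is to improve upon the constant in front of $e^{-\gamma u}$ in Lundberg's inequality as well as to supplement with a lower bound:


Theorem 3.11 Let

$$C_- = \min_{j\in E}\frac{1}{h_j}\cdot\inf_{x\ge 0}\frac{\overline{B}_j(x)}{\int_x^\infty e^{\gamma(y-x)}B_j(dy)},$$

$$C_+ = \max_{j\in E}\frac{1}{h_j}\cdot\sup_{x\ge 0}\frac{\overline{B}_j(x)}{\int_x^\infty e^{\gamma(y-x)}B_j(dy)}. \qquad (3.8)$$

Then for all $i \in E$ and all $u \ge 0$,

$$C_-h_ie^{-\gamma u} \le \psi_i(u) \le C_+h_ie^{-\gamma u}. \qquad (3.9)$$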

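For the proof, we shall need the matrices $G_+$ and $R$ of Section 2. We further write $\overline{G}(u)$ for the vector with $i$th component $\overline{G}_i(u) = \sum_{j\in E}G_+(i,j;(u,\infty))$ and, for a vector $\varphi(u) = (\varphi_i(u))_{i\in E}$ of functions, we let $G_+ * \varphi(u)$ be the vector with $i$th component

$$\sum_{j\in E}(G_+(i,j) * \varphi_j)(u) = \sum_{j\in E}\int_0^u \varphi_j(u-y)\,G_+(i,j;dy).$$


Lemma 3.12 Assume $\sup_{i,u}|\varphi_i^{(0)}(u)| < \infty$, and define $\varphi^{(n+1)}(u) = \overline{G}(u) + (G_+ * \varphi^{(n)})(u)$. Then $\varphi_i^{(n)}(u) \to \psi_i(u)$ as $n \to \infty$.


Proof. Write $U_N = \sum_{n=0}^N G_+^{*n}$, $U = U_\infty = \sum_{n=0}^\infty G_+^{*n}$. Then iterating the defining equation $\varphi^{(n+1)} = \overline{G} + G_+ * \varphi^{(n)}$ we get

$$\varphi^{(N+1)} = U_N * \overline{G} + G_+^{*(N+1)} * \varphi^{(0)}.$$

However, if $\tau_+(n)$ is the $n$th ladder epoch, we have

$$\bigl|\bigl(G_+^{*(N+1)} * \varphi^{(0)}\bigr)_i(u)\bigr| \le \sup_{j,v}|\varphi_j^{(0)}(v)|\,P_i(\tau_+(N+1) < \infty) \to 0.$$

Hence $\lim\varphi^{(N)}$ exists and equals $U * \overline{G}$.

To see that the $i$th component of $U * \overline{G}(u)$ equals $\psi_i(u)$, just note that the recursion $\varphi^{(n+1)} = \overline{G} + G_+ * \varphi^{(n)}$ holds for the particular case where $\varphi_i^{(n)}(u)$ is the probability of ruin after at most $n$ ladder steps and that then obviously $\varphi_i^{(n)}(u) \to \psi_i(u)$, $n \to \infty$.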

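Lemma 3.13 For all $i$ and $u$,

$$C_-\sum_{j\in E}h_j\int_u^\infty e^{\gamma(y-u)}G_+(i,j;dy) \le \overline{G}_i(u) \le C_+\sum_{j\in E}h_j\int_u^\infty e^{\gamma(y-u)}G_+(i,j;dy).$$


Proof. According to (2.2),

$$G_+(i,j;dy) = \int_{-\infty}^0 R(i,j;dx)\,\beta_jB_j(dy - x).$$

Thus

$$C_+\sum_{j\in E}h_j\int_u^\infty e^{\gamma(y-u)}G_+(i,j;dy)$$
$$= C_+\sum_{j\in E}\beta_jh_j\int_{-\infty}^0 R(i,j;dx)\int_u^\infty e^{\gamma(y-u)}B_j(dy - x)$$
$$= C_+\sum_{j\in E}\beta_jh_j\int_{-\infty}^0 R(i,j;dx)\,\frac{\int_{u-x}^\infty e^{\gamma(y-u+x)}B_j(dy)}{\overline{B}_j(u-x)}\,\overline{B}_j(u-x)$$
$$\ge \sum_{j\in E}\beta_j\int_{-\infty}^0 R(i,j;dx)\,\overline{B}_j(u-x) = \overline{G}_i(u),$$

proving the upper inequality, and the proof of the lower one is similar.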


Proof of Theorem 3.11. Let first $\varphi_i^{(0)}(u) = C_-h_ie^{-\gamma u}$ in Lemma 3.12. We claim by induction that then $\varphi_i^{(n)}(u) \ge C_-h_ie^{-\gamma u}$ for all $n$, from which the lower inequality follows by letting $n \to \infty$. Indeed, this is obvious if $n = 0$, and assuming it shown for $n$, we get

$$\varphi_i^{(n+1)}(u) = \overline{G}_i(u) + \sum_{j\in E}\int_0^u \varphi_j^{(n)}(u-y)\,G_+(i,j;dy) \qquad (3.10)$$
$$\ge C_-\sum_{j\in E}\int_u^\infty h_je^{\gamma(y-u)}G_+(i,j;dy) + C_-\sum_{j\in E}\int_0^u h_je^{-\gamma(u-y)}G_+(i,j;dy)$$
$$= C_-e^{-\gamma u}\sum_{j\in E}\widehat{G}_+[i,j;\gamma]h_j = C_-e^{-\gamma u}h_i, \qquad (3.11)$$

estimating the first term in (3.10) by Lemma 3.13 and the second by the induction hypothesis, and using Lemma 3.9 for the last equality in (3.11).


The proof of the upper inequality is similar, taking $\varphi_i^{(0)}(u) = 0$.


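Here is an estimate of the rate of convergence of the finite horizon ruin probabilities $\psi_i(u,T) = P_i(\tau(u) \le T)$ to $\psi_i(u)$ which is different from Theorem 3.10:


Theorem 3.14 Let $\gamma_0 > 0$ be the solution of $\kappa'(\gamma_0) = 0$, let $C_+(\gamma_0)$ be as in (3.8) with $\gamma$ replaced by $\gamma_0$ and $h_i$ by $h_i^{(\gamma_0)}$, and let $\delta = e^{\kappa(\gamma_0)}$. Then

$$0 \le \psi_i(u) - \psi_i(u,T) \le C_+(\gamma_0)\,h_i^{(\gamma_0)}e^{-\gamma_0u}\delta^T. \qquad (3.12)$$

Proof. We first note that just as in the proof of Theorem 3.11, it follows that

$$\psi_i(u) \le C_+(\gamma_0)\,h_i^{(\gamma_0)}e^{-\gamma_0u}. \qquad (3.13)$$

Hence, letting $M_T = \max_{0\le t\le T}S_t$, we have

$$\psi_i(u) - \psi_i(u,T) = P_i(M > u) - P_i(M_T > u) = P_i(M_T \le u,\ M > u)$$
$$= P_i(S_T \le u,\ M_T \le u,\ M > u)$$
$$= E_i\bigl[\psi_{J_T}(u - S_T);\ M_T \le u,\ S_T \le u\bigr]$$
$$\le C_+(\gamma_0)e^{-\gamma_0u}E_i\bigl[h_{J_T}^{(\gamma_0)}e^{\gamma_0S_T}\bigr]$$
$$= C_+(\gamma_0)e^{-\gamma_0u}h_i^{(\gamma_0)}e^{T\kappa(\gamma_0)} = C_+(\gamma_0)\,h_i^{(\gamma_0)}e^{-\gamma_0u}\delta^T.$$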


Notes and references The results and proofs are from Asmussen & Rolski [98]. 
Further related discussion is given in Grigelionis [435, 436]. 

Jasiulewicz [503] uses an integral equation approach to study the ruin probability in a Markov-modulated model with surplus-dependent premium rates; for approaches involving systems of IDEs see Siegl & Tichy [805] and Lu & Li [610]. For moments of discounted aggregate claims, see Kim & Kim [532]. Yin, Liu & Yang [906] deal with effects of state-space reduction of $J_t$ on Lundberg-type bounds for the ruin probability.
Zhu & Yang [921] investigate general regularity issues for ruin-related functions in 
a Markovian environment. For the stability of ruin probabilities w.r.t. parameter 
changes, see Enikeeva, Kalashnikov & Rusaityte [355]. Discrete-time models with 
Markovian environment are e.g. studied in Reinhard & Snoussi [731] and Wagner 
[866]. 


4 Comparisons with the compound Poisson model


4a Ordering of the ruin functions 


For two risk functions $\psi'$, $\psi''$, we define the stochastic ordering by $\psi' \le_{\rm st} \psi''$ if

$$\psi'(u) \le \psi''(u), \quad u \ge 0. \qquad (4.1)$$

Obviously, this corresponds to the usual stochastic ordering of the maxima $M'$, $M''$ of the corresponding two claim surplus processes (note that $\psi'(u) = P(M' > u)$, $\psi''(u) = P(M'' > u)$).

Now consider the risk process in a Markovian environment and define $\psi_\pi(u) = \sum_{i\in E}\pi_i\psi_i(u)$. It was long conjectured that $\psi^* \le_{\rm st} \psi_\pi$, where $\psi^*(u)$ is the ruin probability for the averaged compound Poisson model defined in Section 1 and $\psi_\pi$ is the one for the Markov-modulated one in the stationary case (the distribution of $J_0$ is $\pi$). The motivation that such a result should be true came in part from numerical studies, in part from the folklore principle that any added stochastic variation increases the risk, and finally in part from queueing theory, where it has been observed repeatedly that Markov-modulation increases waiting times and in fact some partial results had been obtained. The results to be presented show that quite often this is so, but that in general the picture is more diverse.
The conditions which play a role in the following are:

$$\beta_1 \le \beta_2 \le \cdots \le \beta_p. \qquad (4.2)$$
$$B_1 \le_{\rm st} B_2 \le_{\rm st} \cdots \le_{\rm st} B_p. \qquad (4.3)$$
$$\text{The Markov process } \{J_t\} \text{ is stochastically monotone.} \qquad (4.4)$$

To avoid trivialities, we also assume that there exist $i \ne j$ such that either $\beta_i < \beta_j$ or $B_i \ne B_j$. Occasionally we strengthen (4.3) to

$$B = B_i \text{ does not depend on } i. \qquad (4.5)$$

Note that whereas (4.2) alone just amounts to an ordering of the states, this is not the case for (4.3). For the notion of monotone Markov processes, we refer to Müller & Stoyan [653]; note that (4.4) is automatic in some simple examples like birth-death processes or $p = 2$. Conditions (4.2)-(4.4) say basically that if $i < j$, then $j$ is the more risky state, and it is in fact easy to show that $\psi_i(u) \le \psi_j(u)$ (this is used in the derivation of (4.9) below).


Theorem 4.1 Assume that conditions (4.2)-(4.4) hold. Then $\psi^* \le_{\rm st} \psi_\pi$.


For the proof, we need two lemmas. The first is a standard result going back to Chebycheff and appearing in a more general form in Esary, Proschan & Walkup [357]; the second follows from an extension of Theorem III.5.5 (cf. also Proposition 2.1) which with basically the same proof can be found in Asmussen & Schmidt [103].


Lemma 4.2 If $a_1 \le \cdots \le a_p$, $b_1 \le \cdots \le b_p$ and $\pi_i > 0$ $(i = 1,\ldots,p)$, $\sum_{i=1}^p\pi_i = 1$, then

$$\sum_{i=1}^p\pi_ia_ib_i \ge \sum_{i=1}^p\pi_ia_i\sum_{j=1}^p\pi_jb_j.$$

The equality holds if and only if $a_1 = \cdots = a_p$ or $b_1 = \cdots = b_p$.
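A quick exact check of Lemma 4.2 on a small instance may be helpful. The weights and sequences below are hypothetical and chosen only for illustration; `chebyshev_gap` is an illustrative helper returning the difference between the two sides.

```python
from fractions import Fraction as F

def chebyshev_gap(pi, a, b):
    """sum_i pi_i a_i b_i - (sum_i pi_i a_i)(sum_j pi_j b_j)."""
    lhs = sum(p * x * y for p, x, y in zip(pi, a, b))
    rhs = sum(p * x for p, x in zip(pi, a)) * sum(p * y for p, y in zip(pi, b))
    return lhs - rhs

# Hypothetical similarly ordered sequences
pi = [F(1, 4), F(1, 4), F(1, 2)]
a = [1, 2, 4]
b = [F(1, 10), F(3, 10), F(1, 2)]

gap = chebyshev_gap(pi, a, b)                    # strictly positive here
gap_const = chebyshev_gap(pi, a, [F(1, 3)] * 3)  # b constant: equality case
```

The gap is the covariance of $a$ and $b$ under the weights $\pi$, which is why it vanishes as soon as one of the sequences is constant.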




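Lemma 4.3 (a) $P_\pi(J_{\tau(0)} = i,\ \tau(0) < \infty) = \rho\,\pi_i^{(+)}$, where $\pi_i^{(+)} = \beta_i\mu_{B_i}\pi_i/\rho$;
(b) $P_\pi(S_{\tau(0)} \in dx \mid J_{\tau(0)} = i,\ \tau(0) < \infty) = \overline{B}_i(x)\,dx/\mu_{B_i}$.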


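Proof of Theorem 4.1. Conditioning upon the first ladder epoch, we obtain (cf. Proposition 2.1 for the first term in (4.7) and Lemma 4.3 for the second)

$$\psi^*(u) = \beta^*\int_u^\infty\overline{B}^*(x)\,dx + \beta^*\int_0^u\psi^*(u-x)\overline{B}^*(x)\,dx, \qquad (4.6)$$

$$\psi_\pi(u) = \beta^*\int_u^\infty\overline{B}^*(x)\,dx + \rho\sum_{i\in E}\pi_i^{(+)}\int_0^u\psi_i(u-x)\frac{\overline{B}_i(x)}{\mu_{B_i}}\,dx \qquad (4.7)$$

$$= \beta^*\int_u^\infty\overline{B}^*(x)\,dx + \int_0^u\sum_{i\in E}\pi_i\beta_i\overline{B}_i(x)\psi_i(u-x)\,dx \qquad (4.8)$$

$$\ge \beta^*\int_u^\infty\overline{B}^*(x)\,dx + \int_0^u\sum_{i\in E}\pi_i\beta_i\overline{B}_i(x)\cdot\sum_{i\in E}\pi_i\psi_i(u-x)\,dx \qquad (4.9)$$

$$= \beta^*\int_u^\infty\overline{B}^*(x)\,dx + \beta^*\int_0^u\overline{B}^*(x)\psi_\pi(u-x)\,dx. \qquad (4.10)$$

Here (4.9) follows by considering the increasing functions $\beta_i\overline{B}_i(x)$ and $\psi_i(u-x)$ of $i$ and using Lemma 4.2. Comparing (4.10) and (4.6), it follows by a standard argument from renewal theory that $\psi_\pi$ dominates the solution $\psi^*$ to the renewal equation (4.6).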


Here is a counterexample showing that the inequality $\psi^*(u) \le \psi_\pi(u)$ is not in general true:


Proposition 4.4 Assume that $\beta_i\mu_{B_i} < 1$ for all $i$, that

$$\sum_{i=1}^p\pi_i\beta_i^2\mu_{B_i} < \sum_{i=1}^p\pi_i\beta_i\sum_{i=1}^p\pi_i\beta_i\mu_{B_i} \qquad (4.11)$$

and that $\Lambda$ has the form $\epsilon\Lambda_0$ for some fixed intensity matrix $\Lambda_0$. Then $\psi^*(u) \le \psi_\pi(u)$ fails for all sufficiently small $\epsilon > 0$.


Proof. Since $\psi_\pi(0) = \psi^*(0)$, it is sufficient to show that $\psi_\pi'(0) < \psi^{*\prime}(0)$ for $\epsilon$ small enough. Using (4.6), (4.8) we get

$$\psi^{*\prime}(0) = -\beta^* + \beta^*\rho = \sum_{i=1}^p\pi_i\beta_i\sum_{j=1}^p\pi_j\beta_j\mu_{B_j} - \beta^*,$$

$$\psi_\pi'(0) = \sum_{i=1}^p\pi_i\beta_i\psi_i(0) - \beta^*.$$

But it is intuitively clear (see Theorem 3.2.1 of [370] for a formal proof) that $\psi_i(u)$ converges to the ruin probability for the compound Poisson model with parameters $\beta_i$, $B_i$ as $\epsilon \downarrow 0$. For $u = 0$, this ruin probability is $\beta_i\mu_{B_i}$, and from this the claim follows.


To see that Proposition 4.4 is not vacuous, let

$$\pi = (1/2\ \ 1/2), \quad \beta_1 = 10^{-3}, \quad \beta_2 = 1, \quad \mu_{B_1} = 10^{2}, \quad \mu_{B_2} = 10^{-4}.$$

Then the l.h.s. of (4.11) is of order $10^{-4}$ and the r.h.s. of order $10^{-1}$.


Notes and references The results are from Asmussen, Frey, Rolski & Schmidt 
[78]. As is seen, they are at present not quite complete. What is missing in relation 
to Theorem 4.1 and Proposition 4.4 is the understanding of whether the stochastic 
monotonicity condition (4.4) is essential (the present authors conjecture it is). 


4b Ordering of adjustment coefficients 


Despite the fact that $\psi^*(u) \le \psi_\pi(u)$ may fail for some $u$, it will hold for all sufficiently large $u$, except possibly for a very special situation. Recall that the adjustment coefficient for the Markov-modulated model is defined as the solution $\gamma > 0$ of $\kappa(\gamma) = 0$ where $\kappa(\alpha)$ is the eigenvalue with maximal real part of the matrix $\Lambda + (\kappa_i(\alpha))_{\rm diag}$ where $\kappa_i(\alpha) = \beta_i(\widehat{B}_i[\alpha] - 1) - \alpha$. The adjustment coefficient $\gamma^*$ for the averaged compound Poisson model is the solution $> 0$ of $\kappa^*(\gamma^*) = 0$ where

$$\kappa^*(\alpha) = \beta^*(\widehat{B}^*[\alpha] - 1) - \alpha = \sum_{i\in E}\pi_i\kappa_i(\alpha). \qquad (4.12)$$


Theorem 4.5 $\gamma \le \gamma^*$, with equality only when $\kappa_i(\gamma^*)$ does not depend on $i \in E$.


Lemma 4.6 Let $(\delta_i)_{i\in E}$ be a given set of constants satisfying $\sum_{i\in E}\pi_i\delta_i = 0$ and define $\lambda(\alpha)$ as the eigenvalue with maximal real part of the matrix $\Lambda + \alpha(\delta_i)_{\rm diag}$. Then $\lambda(\alpha) \ge 0$, with strict inequality unless $\alpha = 0$ or $\delta_i = 0$ for all $i \in E$.

Proof. Define

$$X_t = \int_0^t\delta_{J_s}\,ds.$$

Then $\{(J_t, X_t)\}$ is a Markov additive process (a so-called Markovian fluid model, cf. e.g. Asmussen [62]) as discussed in III.5, and by Proposition III.4.2 we have

$$\bigl(E_i[e^{\alpha X_t};\ J_t = j]\bigr)_{i,j\in E} = e^{t(\Lambda + \alpha(\delta_i)_{\rm diag})}.$$

Further (see Corollary III.4.7) $\lambda$ is convex with

$$\lambda'(0) = \lim_{t\to\infty}\frac{EX_t}{t} = \sum_{i\in E}\pi_i\delta_i = 0, \qquad (4.13)$$

$$\lambda''(0) = \lim_{t\to\infty}\frac{\mathrm{Var}\,X_t}{t}. \qquad (4.14)$$

By convexity, (4.13) implies $\lambda(\alpha) \ge 0$ for all $\alpha$.
Now we can view $\{X_t\}$ as a cumulative process (see A.1d) with generic cycle

$$\omega = \inf\{t > 0 : J_{t-} \ne k,\ J_t = k\}$$

(the return time of $k$) where $k \in E$ is some arbitrary but fixed state. It is clear that the distribution of $X_\omega$ is non-degenerate except when $\delta_i$ does not depend on $i \in E$, which in view of $\sum_{i\in E}\pi_i\delta_i = 0$ is only possible if $\delta_i = 0$ for all $i \in E$. Hence if $\delta_i \ne 0$ for some $i \in E$, it follows by Proposition A1.4(b) that the limit in (4.14) is non-zero so that $\lambda''(0) > 0$. This implies that $\lambda$ is strictly convex, in particular $\lambda(\alpha) > 0$ for all $\alpha \ne 0$.


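Proof of Theorem 4.5. Let $\delta_i = \kappa_i(\gamma^*)$, $\alpha = 1$ in Lemma 4.6. Then $\sum\pi_i\delta_i = 0$ because of (4.12) and $\kappa^*(\gamma^*) = 0$. Further $\lambda(1) = \kappa(\gamma^*)$ by definition of $\lambda(\cdot)$ and $\kappa(\cdot)$. Hence $\kappa(\gamma^*) \ge 0$. Since $\kappa$ is convex with $\kappa'(0) < 0$, this implies that the solution $\gamma > 0$ of $\kappa(\gamma) = 0$ must satisfy $\gamma \le \gamma^*$. If $\kappa_i(\gamma^*)$ is not a constant function of $i \in E$, we get $\kappa(\gamma^*) > 0$ which in a similar manner implies that $\gamma < \gamma^*$.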
Notes and references Theorem 4.5 is from Asmussen & O’Cinneide [93], improv- 
ing upon more incomplete results from Asmussen, Frey, Rolski & Schmidt [78]. 
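
As a numerical illustration of Theorem 4.5, the following minimal sketch computes the adjustment coefficient of a two-state Markov-modulated model and that of its averaged compound Poisson model. All parameter values (the rates in `BETA`, `DELTA` and `Q`) are illustrative assumptions and not taken from the text; claim sizes are assumed exponential so that the m.g.f.s are explicit, and the largest eigenvalue of a 2x2 matrix is evaluated in closed form.

```python
import math

# Illustrative two-state parameters (assumptions, not taken from the text).
BETA = (0.5, 2.0)     # Poisson claim rates beta_i
DELTA = (1.0, 4.0)    # exponential claim-size rates: B_i-hat[a] = delta_i / (delta_i - a)
Q = 1.0               # Lambda = [[-Q, Q], [Q, -Q]], hence pi = (1/2, 1/2)


def kappa_i(alpha, i):
    """kappa_i(alpha) = beta_i (B_i-hat[alpha] - 1) - alpha for exponential claims."""
    return BETA[i] * alpha / (DELTA[i] - alpha) - alpha


def kappa_star(alpha):
    """Averaged model: kappa*(alpha) = sum_i pi_i kappa_i(alpha), cf. (4.12)."""
    return 0.5 * (kappa_i(alpha, 0) + kappa_i(alpha, 1))


def kappa(alpha):
    """Largest eigenvalue of Lambda + diag(kappa_i(alpha)), 2x2 closed form."""
    a = -Q + kappa_i(alpha, 0)
    d = -Q + kappa_i(alpha, 1)
    return 0.5 * (a + d) + math.sqrt(0.25 * (a - d) ** 2 + Q * Q)


def positive_root(f, lo=1e-3, hi=min(DELTA) - 1e-9, iters=200):
    """Bisection; assumes f(lo) < 0 < f(hi), which holds for these convex functions."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


gamma = positive_root(kappa)            # Markov-modulated model
gamma_star = positive_root(kappa_star)  # averaged compound Poisson model
print(f"gamma = {gamma:.6f}, gamma* = {gamma_star:.6f}")
```

In this example the two $\kappa_i(\gamma^*)$ differ, so the sketch should report a strict inequality $\gamma < \gamma^*$.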


4c Sensitivity estimates for the adjustment coefficient 


Now assume that the intensity matrix for the environment is $\Lambda_\epsilon = \Lambda_0/\epsilon$, whereas
the $\beta_i$ and $B_i$ are fixed. The corresponding adjustment coefficient is denoted by
$\gamma(\epsilon)$. Thus $\gamma(\epsilon) \to \gamma^*$ as $\epsilon \downarrow 0$, and our aim is to compute the sensitivity

$$\left.\frac{\partial\gamma}{\partial\epsilon}\right|_{\epsilon=0}.$$

A dual result deals with the limit $\epsilon \to \infty$. Here we put $a = 1/\epsilon$, note that
$\gamma(a) \to \min_{i=1,\ldots,p}\gamma_i$ as $a \downarrow 0$ and compute

$$\left.\frac{\partial\gamma}{\partial a}\right|_{a=0}.$$

In both cases, the basic equation is $(\Lambda + (\kappa_i(\gamma))_{\rm diag})h = 0$, where $\Lambda, \gamma, h$
depend on the parameter ($\epsilon$ or $a$).


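In the case of $\epsilon$, multiply the basic equation by $\epsilon$ and then differentiate with
respect to $\epsilon$ to obtain

$$0 = (\Lambda_0 + \epsilon(\kappa_i(\gamma))_{\rm diag})h,$$

$$0 = \bigl((\kappa_i(\gamma))_{\rm diag} + \epsilon\gamma'(\kappa_i'(\gamma))_{\rm diag}\bigr)h + (\Lambda_0 + \epsilon(\kappa_i(\gamma))_{\rm diag})h'. \tag{4.15}$$

Normalizing $h$ by $\pi h = 1$, we have $\pi h' = 0$, $h(0) = e$. Hence letting $\epsilon = 0$ in
(4.15) yields

$$0 = (\kappa_i(\gamma^*))_{\rm diag}e + \Lambda_0h'(0) = (\kappa_i(\gamma^*))_{\rm diag}e + (\Lambda_0 - e\pi)h'(0),$$

$$h'(0) = -(\Lambda_0 - e\pi)^{-1}(\kappa_i(\gamma^*))_{\rm diag}e. \tag{4.16}$$

Differentiating (4.15) once more and letting $\epsilon = 0$ we get

$$0 = 2\gamma'(0)(\kappa_i'(\gamma^*))_{\rm diag}e + 2(\kappa_i(\gamma^*))_{\rm diag}h'(0) + \Lambda_0h''(0), \tag{4.17}$$

$$0 = 2\gamma'(0)\rho + 2\pi(\kappa_i(\gamma^*))_{\rm diag}h'(0), \tag{4.18}$$

multiplying (4.17) by $\pi$ to the left to get (4.18); here $\pi\Lambda_0 = 0$ removes the last
term and $\rho$ stands for $\pi(\kappa_i'(\gamma^*))_{\rm diag}e = \sum_{i\in E}\pi_i\kappa_i'(\gamma^*)$. Inserting (4.16) yields

Proposition 4.7 $\displaystyle\left.\frac{\partial\gamma}{\partial\epsilon}\right|_{\epsilon=0} = \frac1\rho\,\pi(\kappa_i(\gamma^*))_{\rm diag}(\Lambda_0 - e\pi)^{-1}(\kappa_i(\gamma^*))_{\rm diag}e.$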
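Now turn to the case of $a$. We assume that

$$0 < \gamma_1 < \gamma_i, \quad i = 2,\ldots,p. \tag{4.19}$$

Then $\gamma \to \gamma_1$ as $a \downarrow 0$ and we may take $h(0) = e_1$ (the first unit vector). We
get

$$0 = (a\Lambda_0 + (\kappa_i(\gamma))_{\rm diag})h,$$

$$0 = (\Lambda_0 + \gamma'(\kappa_i'(\gamma))_{\rm diag})h + (a\Lambda_0 + (\kappa_i(\gamma))_{\rm diag})h'. \tag{4.20}$$

Letting $a = 0$ in (4.20) and multiplying by $e_1$ to the left we get $0 = \lambda_{11} +
\gamma'(0)\kappa_1'(\gamma_1) + 0$ (here $\lambda_{11}$ is the first diagonal element of $\Lambda_0$, and we used
$\kappa_1(\gamma(0)) = 0$ to infer that the first component of $K[\gamma(0)]h'(0)$ is 0), and we
have proved:


Proposition 4.8 If (4.19) holds, then $\displaystyle\left.\frac{\partial\gamma}{\partial a}\right|_{a=0} = -\frac{\lambda_{11}}{\kappa_1'(\gamma_1)}$.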
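
The two sensitivities can be checked against finite differences. The sketch below does this for an assumed two-state example with exponential claims (all numbers are illustrative). It uses the formulas in the form $\gamma'(0) = \rho^{-1}\pi K(\Lambda_0 - e\pi)^{-1}Ke$ with $K = (\kappa_i(\gamma^*))_{\rm diag}$ and $\rho = \sum_i\pi_i\kappa_i'(\gamma^*)$ for the $\epsilon$-case, and $\gamma'(0) = -\lambda_{11}/\kappa_1'(\gamma_1)$ for the $a$-case; the step size `1e-4` is an arbitrary choice.

```python
import math

# Illustrative two-state parameters (assumptions, not taken from the text).
BETA = (0.5, 2.0)      # Poisson rates beta_i
DELTA = (1.0, 4.0)     # exponential claim-size rates
Q = 1.0                # Lambda_0 = [[-Q, Q], [Q, -Q]]
PI = (0.5, 0.5)        # stationary distribution of Lambda_0


def kappa_i(alpha, i):
    return BETA[i] * alpha / (DELTA[i] - alpha) - alpha


def dkappa_i(alpha, i):
    return BETA[i] * DELTA[i] / (DELTA[i] - alpha) ** 2 - 1.0


def kappa(alpha, scale):
    """Largest eigenvalue of scale * Lambda_0 + diag(kappa_i(alpha))."""
    a = -scale * Q + kappa_i(alpha, 0)
    d = -scale * Q + kappa_i(alpha, 1)
    off = scale * Q
    return 0.5 * (a + d) + math.sqrt(0.25 * (a - d) ** 2 + off * off)


def positive_root(f, lo=1e-3, hi=min(DELTA) - 1e-9, iters=200):
    """Bisection; assumes f(lo) < 0 < f(hi)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


gamma_star = positive_root(lambda x: PI[0] * kappa_i(x, 0) + PI[1] * kappa_i(x, 1))
gamma_1 = positive_root(lambda x: kappa_i(x, 0))   # here gamma_1 < gamma_2

# Epsilon-case: Lambda = Lambda_0 / eps, prediction versus finite difference.
k = [kappa_i(gamma_star, i) for i in range(2)]
rho = sum(PI[i] * dkappa_i(gamma_star, i) for i in range(2))
m_inv = inverse_2x2([[-Q - PI[0], Q - PI[1]], [Q - PI[0], -Q - PI[1]]])
pred_eps = sum(PI[i] * k[i] * m_inv[i][j] * k[j]
               for i in range(2) for j in range(2)) / rho
eps = 1e-4
fd_eps = (positive_root(lambda x: kappa(x, 1.0 / eps)) - gamma_star) / eps

# a-case: Lambda = a * Lambda_0, with lambda_11 = -Q.
pred_a = Q / dkappa_i(gamma_1, 0)
a_small = 1e-4
fd_a = (positive_root(lambda x: kappa(x, a_small)) - gamma_1) / a_small

print(pred_eps, fd_eps)
print(pred_a, fd_a)
```

The negative sign of the first derivative is in line with Theorem 4.5: slowing the environment down moves $\gamma$ below $\gamma^*$.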


Notes and references The results are from Asmussen, Frey, Rolski & Schmidt 
[78]. The analogue of Proposition 4.8 when $\gamma_i < 0$ for some $i$ is open.


5 The Markovian arrival process


We shall here briefly survey an extension of the model, which has recently re- 
ceived much attention in the queueing literature, and has some relevance in risk 
theory as well. 

The additional feature of the model is the following: 


• Certain transitions of $\{J_t\}$ from state $i$ to state $j$ are accompanied by a
claim with distribution $B_{ij}$; the intensity for such a transition (referred to
as marked in the following) is denoted by $\lambda_{ij}^{(2)}$ and the remaining intensity
for a transition $i \to j$ by $\lambda_{ij}^{(1)}$ (thus $\lambda_{ij} = \lambda_{ij}^{(1)} + \lambda_{ij}^{(2)}$). For $i = j$, we use
the convention that $\lambda_{ii}^{(2)} = \beta_i$ where $\beta_i$ is the Poisson rate in state $i$, that
$B_{ii} = B_i$, and that the $\lambda_{ii}^{(1)}$ are determined by $\Lambda = \Lambda^{(1)} + \Lambda^{(2)}$ where $\Lambda$
is the intensity matrix governing $\{J_t\}$.


Thus, the Markov-modulated compound Poisson model considered so far corre-
sponds to $\Lambda^{(2)} = (\beta_i)_{\rm diag}$, $\Lambda^{(1)} = \Lambda - (\beta_i)_{\rm diag}$, $B_{ii} = B_i$; the definition of $B_{ij}$ is
redundant for $i \ne j$.

Note that the case that $0 < q_{ij} < 1$, where $q_{ij}$ is the probability that a
transition $i \to j$ is accompanied by a claim, is covered by letting $B_{ij}$ have an
atom of size $1 - q_{ij}$ at 0 (a claim of size zero being the same as no claim).

Again, the claim surplus is a Markov additive process (cf. III.3). The ex-
tension of the model can also be motivated via Markov additive processes: if
$\{N_t\}$ is the counting process of a point process, then $\{N_t\}$ is a Markov additive
process if and only if it corresponds to an arrival mechanism of the type just
considered.

Here are some main examples: 


Example 5.1 (PHASE-TYPE RENEWAL ARRIVALS) Consider a risk process 
where the claim sizes are i.i.d. with common distribution B, but the point pro- 
cess of arrivals is not Poisson but renewal with interclaim times having common 
distribution A of phase-type with representation (v, T). In the above setting, 
we may let $\{J_t\}$ represent the phase processes of the individual interarrival times
glued together (see further IX.2 for details), and the marked transitions are then 
the ones corresponding to arrivals. This is the only way in which arrivals can 
occur, and thus 


the definition of $B_i$ is redundant because of $\beta_i = 0$.


Example 5.2 (SUPERPOSITIONS) A nice feature of the set-up is that it is closed
under superposition of independent arrival streams. Indeed, let $\{J_t^{(1)}\}$, $\{J_t^{(2)}\}$
be two independent environmental processes and let $E^{(k)}$, $\Lambda^{(1;k)}$, $\Lambda^{(2;k)}$, $B_{ij}^{(k)}$
etc. refer to $\{J_t^{(k)}\}$. We then let (see the Appendix for the Kronecker notation)

$$E = E^{(1)}\times E^{(2)}, \quad J_t = (J_t^{(1)}, J_t^{(2)}),$$

$$\Lambda^{(1)} = \Lambda^{(1;1)}\oplus\Lambda^{(1;2)}, \quad \Lambda^{(2)} = \Lambda^{(2;1)}\oplus\Lambda^{(2;2)},$$

$$B_{ij,kj} = B_{ik}^{(1)}, \quad B_{ij,ik} = B_{jk}^{(2)}$$

(the definition of the remaining $B_{ij,k\ell}$ is redundant). In this way we can model,
e.g., superpositions of renewal processes.
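
The superposition rule can be made concrete with a small sketch, assuming the usual definition $A\oplus B = A\otimes I + I\otimes B$ of the Kronecker sum. The two streams below are Markov-modulated Poisson streams with made-up rates; the helper names (`kron`, `kron_sum`, `mmpp_split`) are hypothetical and only serve the illustration.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def eye(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def kron_sum(a, b):
    """Kronecker sum a (+) b = a (x) I + I (x) b."""
    return add(kron(a, eye(len(b))), kron(eye(len(a)), b))


def mmpp_split(lam, beta):
    """Split a Markov-modulated Poisson stream into (unmarked, marked) parts:
    marked = diag(beta_i), unmarked = Lambda - diag(beta_i)."""
    n = len(beta)
    marked = [[beta[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    unmarked = [[lam[i][j] - marked[i][j] for j in range(n)] for i in range(n)]
    return unmarked, marked


# Two independent streams with illustrative (assumed) parameters.
u1, m1 = mmpp_split([[-1.0, 1.0], [2.0, -2.0]], [0.5, 3.0])
u2, m2 = mmpp_split([[-4.0, 4.0], [1.0, -1.0]], [2.0, 0.0])

unmarked = kron_sum(u1, u2)    # Lambda^(1) of the superposition
marked = kron_sum(m1, m2)      # Lambda^(2) of the superposition
total = add(unmarked, marked)  # intensity matrix of the joint environment
row_sums = [sum(row) for row in total]
print(row_sums)
```

The joint environment has $2\times2 = 4$ states, the rows of the combined matrix sum to zero as they must for an intensity matrix, and the diagonal of the marked part holds the sums of the component Poisson rates.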


Example 5.3 (AN INDIVIDUAL MODEL) In contrast to the collective assump-
tions (which underlie most of the topics treated so far in this book and lead to
Poisson arrivals), assume that there is a finite number $N$ of policies. Assume fur-
ther that the $i$th policy leads to a claim having distribution $C_i$ after a time which
is exponential, with rate $\alpha_i$, say, and that the policy then expires. This means
that the environmental states are of the form $i_1i_2\cdots i_N$ with $i_1, i_2, \ldots \in \{0,1\}$,
where $i_k = 0$ means that the $k$th policy has not yet expired and $i_k = 1$ that it
has expired. Thus, claims occur only at state transitions for the environment so
that

$$\lambda_{0i_2\cdots i_N,\,1i_2\cdots i_N} = \alpha_1, \quad B_{0i_2\cdots i_N,\,1i_2\cdots i_N} = C_1,$$

$$\lambda_{i_10i_3\cdots i_N,\,i_11i_3\cdots i_N} = \alpha_2, \quad B_{i_10i_3\cdots i_N,\,i_11i_3\cdots i_N} = C_2,$$

$$\vdots$$

All other off-diagonal elements of $\Lambda$ are zero so that all other $B_{ij}$ are redundant.
Similarly, all $\beta_{i_1i_2\cdots i_N}$ are zero and all $B_i$ are redundant. Easy modifications
apply to allow for


• the time until expiration of the $k$th policy is general phase-type rather
than exponential,


• upon a claim, the $k$th policy enters a recovering state, possibly having a
general phase-type sojourn time, after which it starts afresh.


Example 5.4 (A SINGLE LIFE INSURANCE POLICY) Consider the life insurance 
of a single policy holder which can be in one of several states, E = {WORKING, 
RETIRED, MARRIED, DIVORCED, WIDOWED, INVALIDIZED, DEAD etc.}. The 
individual pays at rate $p_i$ when in state $i$ and receives an amount having
distribution $B_{ij}$ when his/her state changes from $i$ to $j$.


Notes and references The point process of arrivals was studied in detail by
Neuts [658] and is often referred to in the queueing literature as Neuts’ versatile point
process, or, more recently, as the Markovian arrival process (MAP). However, the idea
of arrivals at transition epochs can be found in Hermann [460] and Rudemo [755].

The versatility of the set-up is even greater than for the Markov-modulated model.
In fact, Hermann [460] and Asmussen & Koole [88] showed that in some appropriate
sense any arrival stream to a risk process can be approximated by a model of the
type studied in this section: any marked point process is the weak limit of a sequence
of such models. For the Markov-modulated model, one limitation for approximation
purposes is the inequality ${\rm Var}\,N_t \ge EN_t$ which need not hold for all arrival streams.

Some main queueing references using the MAP are Ramaswami [722], Sengupta 
[794], Lucantoni [612], Lucantoni et al. [612], Neuts [662] and Asmussen & Perry [95]. 
For recent applications in risk theory, see e.g. Badescu, Drekic & Landriault [118] and 
Cheung & Landriault [239]. 


6 Risk theory in a periodic environment 


6a The model 


We assume as in the previous part of the chapter that the arrival mechanism 
has a certain time-inhomogeneity, but now exhibiting (deterministic) periodic 
fluctuations rather than (random) Markovian ones. Without loss of generality, 
let the period be 1; for s € E = [0,1), we talk of s as the ‘time of the year’. The 
basic assumptions are as follows: 


• The arrival intensity at time $t$ of the year is $\beta(t)$ for a certain function
$\beta(t)$, $0 \le t < 1$;


• Claims arriving at time $t$ of the year have distribution $B^{(t)}$;
• The premium rate at time $t$ of the year is $p(t)$.


By periodic extension, we may assume that the functions $\beta(t)$, $p(t)$ and $B^{(t)}$ are
defined also for $t \notin [0,1)$. Obviously, one needs to assume also (as a minimum)
that they are measurable in $t$; from an application point of view, continuity
would hold in presumably all reasonable examples.

We denote throughout the initial season by $s$ and by $P^{(s)}$ the corresponding
governing probability measure for the risk process. Thus at time $t$ the premium
rate is $p(s+t)$, a claim arrives with rate $\beta(s+t)$ and is distributed according
to $B^{(s+t)}$. Let

$$\beta^* = \int_0^1\beta(t)\,dt, \quad B^* = \int_0^1B^{(t)}\frac{\beta(t)}{\beta^*}\,dt, \quad p^* = \int_0^1p(t)\,dt. \tag{6.1}$$

Then the average arrival rate is $\beta^*$ and the safety loading $\eta$ is $\eta = (p^* - \rho)/\rho$,
where

$$\rho = \int_0^1\beta(v)\,dv\int_0^\infty x\,B^{(v)}(dx) = \beta^*\mu_{B^*}. \tag{6.2}$$

Note that $\rho$ is the average net claim amount per unit time and $\mu^* = \rho/\beta^*$ the
average mean claim size.

In a similar manner as in Proposition 1.8, one may think of the standard 
compound Poisson model with parameters 3*, B*, p* as an averaged version 
of the periodic model, or, equivalently, of the periodic model as arising from 
the compound Poisson model by adding some extra variability. Many of the 
results given below indicate that the averaged and the periodic model share a 
number of main features. In particular, it turns out that they have the same 
adjustment coefficient. In contrast, for the Markov-modulated model typically the
adjustment coefficient is smaller than for the averaged model (cf. Section 4b), in
agreement with the general principle of added variation increasing the risk (cf. 
the discussion in IV.9). The behavior of the periodic model does not need to be 
seen as a violation of this principle, since the added variation is deterministic, 
not random. 


Example 6.1 As an example to be used for numerical illustration throughout
this section, let $\beta(t) = 3\lambda(1 + \sin2\pi t)$, $p(t) = \lambda$ and let $B^{(t)}$ be a mixture
of two exponential distributions with intensities 3 and 7 and weights $w(t) =
(1 + \cos2\pi t)/2$ and $1 - w(t)$, respectively.

It is easily seen that $\beta^* = 3\lambda$, $p^* = \lambda$ whereas $B^*$ is a mixture of expo-
nential distributions with intensities 3 and 7 and weights 1/2 for each ($1/2 =
\int_0^1w(t)\,dt = \int_0^1(1 - w(t))\,dt$). Thus, the average compound Poisson model is
the same as in IV.(3.1) and Example 1.10, and we recall from there that the
ruin probability is

$$\psi^*(u) = \frac{24}{35}e^{-u} + \frac{1}{35}e^{-6u}. \tag{6.3}$$

Note that $\lambda$ enters just as a scaling factor of the time axis, and thus the averaged
standard compound Poisson models have the same risk for all $\lambda$. In contrast,
we shall see that for the periodic model increasing $\lambda$ increases the effect of the
periodic fluctuations.
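
The averaged quantities of Example 6.1 are easy to reproduce numerically. The following sketch, with an arbitrary assumed value `LAM = 2.0` for $\lambda$, integrates over one period by the midpoint rule and checks that $\alpha = 1$ solves the Lundberg equation of the averaged model (written here with premium rate $p^* = \lambda$) and that $\psi^*(0)$ equals $\rho/p^*$, as it must for a compound Poisson model.

```python
import math

LAM = 2.0  # the scaling parameter lambda of Example 6.1 (arbitrary choice)


def beta(t):
    return 3.0 * LAM * (1.0 + math.sin(2.0 * math.pi * t))


def w(t):
    return 0.5 * (1.0 + math.cos(2.0 * math.pi * t))


def mean_claim(t):
    """Mean of the mixture w(t) Exp(3) + (1 - w(t)) Exp(7)."""
    return w(t) / 3.0 + (1.0 - w(t)) / 7.0


def midpoint(f, n=1000):
    """Midpoint rule on [0, 1]."""
    return sum(f((k + 0.5) / n) for k in range(n)) / n


beta_star = midpoint(beta)
p_star = LAM
weight3 = midpoint(lambda t: w(t) * beta(t)) / beta_star   # weight of Exp(3) in B*
rho = midpoint(lambda t: beta(t) * mean_claim(t))
eta = (p_star - rho) / rho


def kappa_star(alpha):
    """c.g.f. of the averaged model, with premium rate p* instead of 1."""
    mgf = weight3 * 3.0 / (3.0 - alpha) + (1.0 - weight3) * 7.0 / (7.0 - alpha)
    return beta_star * (mgf - 1.0) - p_star * alpha


def psi_star(u):
    """Ruin probability (6.3) of the averaged compound Poisson model."""
    return 24.0 / 35.0 * math.exp(-u) + 1.0 / 35.0 * math.exp(-6.0 * u)


print(beta_star, weight3, rho, eta, kappa_star(1.0), psi_star(0.0))
```

The safety loading comes out as $\eta = 2/5$ regardless of $\lambda$, consistent with $\lambda$ acting only as a time scale in the averaged model.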


Remark 6.2 Define

$$\theta(T) = \int_0^Tp(t)\,dt, \quad \tilde S_t = S_{\theta^{-1}(t)}.$$

Then (by standard operational time arguments) $\{\tilde S_t\}$ is a periodic risk process
with unit premium rate and the same infinite horizon ruin probabilities. We
assume in the rest of this section that $p(t) \equiv 1$.

The arrival process $\{N_t\}_{t\ge0}$ is a time-inhomogeneous Poisson process with
intensity function $\{\beta(s+t)\}_{t\ge0}$. The claim surplus process $\{S_t\}_{t\ge0}$ is defined
in the obvious way as $S_t = \sum_{i=1}^{N_t}U_i - t$. Thus, the conditional distribution
of $U_i$ given that the $i$th claim occurs at time $t$ is $B^{(s+t)}$. As usual, $\tau(u) =
\inf\{t \ge 0 : S_t > u\}$ is the time to ruin, and the ruin probabilities are

$$\psi^{(s)}(u) = P^{(s)}(\tau(u) < \infty), \quad \psi^{(s)}(u,T) = P^{(s)}(\tau(u) \le T).$$

The claim surplus process $\{S_t\}$ may be seen as a Markov additive process,
with the underlying Markov process $\{J_t\}$ being deterministic periodic motion on
$E = [0,1)$, i.e.

$$J_t = (s+t) \bmod 1 \quad P^{(s)}\text{-a.s.} \tag{6.4}$$

At first sight this point of view may appear quite artificial, but it turns out
to have obvious benefits in terms of guiding the analysis of the model as a
parallel of the analysis for the Markovian environment risk process.


Notes and references The model has been studied in risk theory by, e.g., Daykin
et al. [279], Dassios & Embrechts [273] and Asmussen & Rolski [97], [98] (the literature
in the mathematically equivalent setting of queueing theory is somewhat more exten-
sive, see the Notes to Section 7). The exposition of the present chapter is basically
an extract from [98], with some variants in the proofs. Recently, Kötter & Bäuerle
[558] addressed the stochastic optimization problem to minimize the ruin probability
through investment in the framework of a periodic environment, see also Chapter XIV.


6b Lundberg conjugation 


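Motivated by the discussion in Chapter III.4 (see in particular Remark III.4.8),
we start by deriving formulas giving the m.g.f. of the claim surplus process. To
this end, let

$$\kappa^*(\alpha) = \beta^*(\hat B^*[\alpha] - 1) - \alpha = \int_s^{s+1}\beta(v)(\hat B^{(v)}[\alpha] - 1)\,dv - \alpha$$

be the c.g.f. of the averaged compound Poisson model (the last expression is
independent of $s$ by periodicity), and define

$$h(s;\alpha) = \exp\Bigl\{-\int_0^s\bigl[\beta(v)(\hat B^{(v)}[\alpha] - 1) - \alpha - \kappa^*(\alpha)\bigr]\,dv\Bigr\};$$

then $h(\cdot;\alpha)$ is periodic on $\mathbb R$.


Theorem 6.3 $\displaystyle E^{(s)}e^{\alpha S_t} = \frac{h(s;\alpha)}{h(s+t;\alpha)}\,e^{t\kappa^*(\alpha)}.$


Proof. Conditioning upon whether a claim occurs in $[t, t+dt]$ or not, we obtain

$$\begin{aligned}
E^{(s)}\bigl[e^{\alpha S_{t+dt}}\,\big|\,\mathcal F_t\bigr]
&= (1 - \beta(s+t)dt)\,e^{\alpha S_t - \alpha dt} + \beta(s+t)dt\cdot e^{\alpha S_t}\hat B^{(s+t)}[\alpha]\\
&= e^{\alpha S_t}\bigl(1 - \alpha dt + \beta(s+t)dt[\hat B^{(s+t)}[\alpha] - 1]\bigr),\\
E^{(s)}e^{\alpha S_{t+dt}} &= E^{(s)}e^{\alpha S_t}\bigl(1 - \alpha dt + \beta(s+t)dt[\hat B^{(s+t)}[\alpha] - 1]\bigr),\\
\frac{d}{dt}E^{(s)}e^{\alpha S_t} &= E^{(s)}e^{\alpha S_t}\bigl(-\alpha + \beta(s+t)[\hat B^{(s+t)}[\alpha] - 1]\bigr),\\
\frac{d}{dt}\log E^{(s)}e^{\alpha S_t} &= -\alpha + \beta(s+t)[\hat B^{(s+t)}[\alpha] - 1],\\
\log E^{(s)}e^{\alpha S_t} &= -\alpha t + \int_0^t\beta(s+v)(\hat B^{(s+v)}[\alpha] - 1)\,dv\\
&= \log\tilde h(s+t;\alpha) - \log\tilde h(s;\alpha),
\end{aligned}$$

where

$$\tilde h(t;\alpha) = \exp\Bigl\{\int_0^t\beta(v)(\hat B^{(v)}[\alpha] - 1)\,dv - \alpha t\Bigr\} = \frac{e^{t\kappa^*(\alpha)}}{h(t;\alpha)}.$$

Thus

$$E^{(s)}e^{\alpha S_t} = \frac{\tilde h(s+t;\alpha)}{\tilde h(s;\alpha)} = \frac{h(s;\alpha)}{h(s+t;\alpha)}\,e^{t\kappa^*(\alpha)}. \tag{6.5}$$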

Corollary 6.4 For each $\theta$ such that the integrals in the definition of $h(t;\theta)$ exist
and are finite,

$$\{L_{\theta,t}\}_{t\ge0} = \Bigl\{\frac{h(s+t;\theta)}{h(s;\theta)}\,e^{\theta S_t - t\kappa^*(\theta)}\Bigr\}_{t\ge0}$$

is a $P^{(s)}$-martingale with mean one.

Proof. In the Markov additive sense of (6.4), we can write

$$L_{\theta,t} = \frac{h(J_t;\theta)}{h(J_0;\theta)}\,e^{\theta S_t - t\kappa^*(\theta)}$$

$P^{(s)}$-a.s. so that obviously $\{L_{\theta,t}\}$ is a multiplicative functional for the Markov
process $\{(J_t, S_t)\}$. According to Remark III.1.8, it then suffices to note that
$E^{(s)}L_{\theta,t} = 1$ by Theorem 6.3.


Remark 6.5 The formula for $h(s) = h(s;\alpha)$ as well as the fact that $\kappa = \kappa^*(\alpha)$
is the correct exponential growth rate of $Ee^{\alpha S_t}$ can be derived via Remark III.4.9
as follows. With $\mathcal G$ the generator of $\{X_t\} = \{(J_t, S_t)\}$ and $h_\alpha(s,y) = e^{\alpha y}h(s)$,
the requirement is $\mathcal Gh_\alpha(s,0) = \kappa h(s)$. However, as above

$$\begin{aligned}
E^{(s)}h_\alpha(J_{dt}, S_{dt})
&= h(s+dt)e^{-\alpha dt}(1 - \beta(s)dt) + \beta(s)dt\cdot\hat B^{(s)}[\alpha]h(s)\\
&= h(s) + dt\bigl\{-\alpha h(s) - \beta(s)h(s) + h'(s) + \beta(s)\hat B^{(s)}[\alpha]h(s)\bigr\},\\
\mathcal Gh_\alpha(s,0) &= -\alpha h(s) - \beta(s)h(s) + h'(s) + \beta(s)\hat B^{(s)}[\alpha]h(s).
\end{aligned}$$

Equating this to $\kappa h(s)$ and dividing by $h(s)$ yields

$$\frac{h'(s)}{h(s)} = \alpha + \beta(s) - \beta(s)\hat B^{(s)}[\alpha] + \kappa,$$

$$h(s) = \exp\Bigl\{-\int_0^s\bigl[\beta(v)(\hat B^{(v)}[\alpha] - 1) - \alpha - \kappa\bigr]\,dv\Bigr\}$$

(normalizing by $h(0) = 1$). That $\kappa = \kappa^*(\alpha)$ then follows by noting that $h(1) =
h(0)$ by periodicity.


For each $\theta$ satisfying the conditions of Corollary 6.4, it follows by Theorem
III.1.7 that we can define a new Markov process $\{(J_t, S_t)\}$ with governing prob-
ability measures $P_\theta^{(s)}$, say, such that for any $s$ and $T < \infty$, the restrictions of
$P^{(s)}$ and $P_\theta^{(s)}$ to $\mathcal F_T$ are equivalent with likelihood ratio $L_{\theta,T}$.


Proposition 6.6 The $P_\theta^{(s)}$, $0 \le s < 1$, correspond to a new periodic risk model
with parameters

$$\beta_\theta(t) = \beta(t)\hat B^{(t)}[\theta], \quad B_\theta^{(t)}(dx) = \frac{e^{\theta x}}{\hat B^{(t)}[\theta]}\,B^{(t)}(dx).$$

Proof. (i) Check that the m.g.f. of $S_t$ is as for the asserted periodic risk model, cf.
Theorem 6.3; (ii) use Markov-modulated approximations (Section 6c); (iii)
use approximations with piecewise constant $\beta(s)$, $B^{(s)}$; (iv) finally, see [98] for
a formal proof.


Now define $\gamma$ as the positive solution of the Lundberg equation for the av-
eraged model. That is, $\gamma$ solves $\kappa^*(\gamma) = 0$. When $\alpha = \gamma$, we put for short
$h(s) = h(s;\gamma)$. A further important constant is the value $\gamma_0$ (located in $(0,\gamma)$)
at which $\kappa^*(\alpha)$ attains its minimum. That is, $\gamma_0$ is determined by

$$0 = \kappa^{*\prime}(\gamma_0) = \beta^*\hat B^{*\prime}[\gamma_0] - 1. \tag{6.6}$$

Lemma 6.7 When $\alpha \ge \gamma_0$, $P_\alpha^{(s)}(\tau(u) < \infty) = 1$ for all $u \ge 0$.

Proof. According to (6.2), the average claim amount per unit time under $P_\alpha^{(s)}$ is

$$\begin{aligned}
\rho_\alpha &= \int_0^1\beta_\alpha(v)\,dv\int_0^\infty x\,B_\alpha^{(v)}(dx) = \int_0^1\beta(v)\,dv\int_0^\infty xe^{\alpha x}\,B^{(v)}(dx)\\
&= \beta^*\int_0^\infty xe^{\alpha x}\,B^*(dx) = \beta^*\hat B^{*\prime}[\alpha] = \kappa^{*\prime}(\alpha) + 1,
\end{aligned}$$

which is $\ge 1$ by convexity.


The relevant likelihood ratio representation of the ruin probabilities now
follows immediately from Corollary III.1.5. Here and in the following, $\xi(u) =
S_{\tau(u)} - u$ is the overshoot and $\theta(u) = (\tau(u) + s) \bmod 1$ the season at the time
of ruin.


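Corollary 6.8 The ruin probabilities can be computed as

$$\psi^{(s)}(u,T) = h(s;\alpha)e^{-\alpha u}\,E_\alpha^{(s)}\Bigl[\frac{e^{-\alpha\xi(u) + \tau(u)\kappa^*(\alpha)}}{h(\theta(u);\alpha)};\ \tau(u) \le T\Bigr], \tag{6.7}$$

$$\psi^{(s)}(u) = h(s;\alpha)e^{-\alpha u}\,E_\alpha^{(s)}\Bigl[\frac{e^{-\alpha\xi(u) + \tau(u)\kappa^*(\alpha)}}{h(\theta(u);\alpha)};\ \tau(u) < \infty\Bigr], \tag{6.8}$$

$$\psi^{(s)}(u) = h(s)e^{-\gamma u}\,E_\gamma^{(s)}\frac{e^{-\gamma\xi(u)}}{h(\theta(u))}. \tag{6.9}$$


To obtain the Cramér-Lundberg approximation from Corollary 6.8, we need
the following auxiliary result. The proof involves machinery from the ergodic
theory of Markov chains on a general state space, which is not used elsewhere
in the book, and we refer to [98].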


Lemma 6.9 Assume that there exist open intervals $I \subseteq [0,1)$, $J \subseteq \mathbb R_+$ such
that the $B^{(s)}$, $s \in I$, have components with densities $b^{(s)}(x)$ satisfying

$$\inf_{s\in I,\ x\in J}\beta(s)b^{(s)}(x) > 0. \tag{6.10}$$

Then for each $\alpha$, the Markov process $\{(\xi(u), \theta(u))\}_{u\ge0}$, considered with gov-
erning probability measures $\{P_\alpha^{(s)}\}_{s\in[0,1)}$, has a unique stationary distribution,
say the distribution of $(\xi(\infty), \theta(\infty))$, and no matter what is the initial season
$s$, $(\xi(u), \theta(u)) \xrightarrow{\mathcal D} (\xi(\infty), \theta(\infty))$.


Letting $u \to \infty$ in (6.9) and noting that weak convergence entails convergence of
$Ef(\xi(u), \theta(u))$ for any bounded continuous function (e.g. $f(x,q) = e^{-\gamma x}/h(q)$),
we get:

Theorem 6.10 Under the condition (6.10) of Lemma 6.9,

$$\psi^{(s)}(u) \sim Ch(s)e^{-\gamma u}, \quad u \to \infty, \tag{6.11}$$

where $\displaystyle C = E\,\frac{e^{-\gamma\xi(\infty)}}{h(\theta(\infty))}$.


Note that (6.11) gives an interpretation of $h(s)$ as a measure of how the
risks of different initial seasons $s$ vary. For our basic Example 6.1, elementary
calculus yields

$$h(s) = \exp\Bigl\{\lambda\Bigl(\frac{1}{2\pi}\cos2\pi s - \frac{1}{4\pi}\sin2\pi s + \frac{1}{16\pi}\cos4\pi s - \frac{9}{16\pi}\Bigr)\Bigr\}.$$

Plots of $h$ for different values of $\lambda$ are given in Fig. VII.4, illustrating that the
effect of seasonality increases with $\lambda$.
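
The closed form for $h(s)$ in Example 6.1 can be checked against direct quadrature of the defining integral at $\alpha = \gamma = 1$ (where $\kappa^*(\gamma) = 0$). The sketch below assumes that the premium rate $p(t) = \lambda$ of the example enters the integrand as $-\lambda\gamma$, which is what reproduces the factor $\lambda$ in the quoted formula; the function names are hypothetical.

```python
import math


def beta(v, lam):
    return 3.0 * lam * (1.0 + math.sin(2.0 * math.pi * v))


def mgf_at_gamma(v):
    """B-hat^{(v)}[1] for the mixture w(v) Exp(3) + (1 - w(v)) Exp(7)."""
    w = 0.5 * (1.0 + math.cos(2.0 * math.pi * v))
    return w * 3.0 / 2.0 + (1.0 - w) * 7.0 / 6.0


def h_numeric(s, lam, n=2000):
    """exp{-int_0^s [beta(v)(B-hat^{(v)}[1] - 1) - lam] dv} by the midpoint rule."""
    step = s / n
    total = sum(beta((k + 0.5) * step, lam) * (mgf_at_gamma((k + 0.5) * step) - 1.0) - lam
                for k in range(n))
    return math.exp(-total * step)


def h_closed(s, lam):
    """Closed form quoted for Example 6.1."""
    x = 2.0 * math.pi * s
    return math.exp(lam * (math.cos(x) / (2.0 * math.pi)
                           - math.sin(x) / (4.0 * math.pi)
                           + math.cos(2.0 * x) / (16.0 * math.pi)
                           - 9.0 / (16.0 * math.pi)))


seasons = [k / 10.0 for k in range(11)]
max_gap = max(abs(h_numeric(s, 1.0) - h_closed(s, 1.0)) for s in seasons)
print(max_gap)
```

Both versions satisfy $h(0) = h(1) = 1$, reflecting the normalization and the periodicity of $h$.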


FIGURE VII.4 


In contrast to h, it does not seem within the range of our methods to compute 
C explicitly, which may provide one among many motivations for the Markov- 
modulated approximation procedure to be considered in Section 6c. Among 
other things, this provides an algorithm for computing C as a limit. At this 
stage, Theorem 6.10 shows that certainly $\gamma$ is the correct Lundberg exponent.

Noting that $\xi(u) \ge 0$ in (6.9), we obtain immediately the following version of
Lundberg’s inequality which is a direct parallel of the result given in Corollary
3.6 for the Markov-modulated model:


Theorem 6.11 $\psi^{(s)}(u) \le C_+^{(0)}h(s)e^{-\gamma u}$, where

$$C_+^{(0)} = \frac{1}{\inf_{0\le t\le1}h(t)}.$$

Thus, e.g., in our basic example with $\lambda = 1$, we obtain $C_+^{(0)} = 1.42$ so that

$$\psi^{(s)}(u) \le 1.42\cdot\exp\Bigl\{\frac{1}{2\pi}\cos2\pi s - \frac{1}{4\pi}\sin2\pi s + \frac{1}{16\pi}\cos4\pi s - \frac{9}{16\pi}\Bigr\}\cdot e^{-u}. \tag{6.12}$$


As for the Markovian environment model, Lundberg’s inequality can be con- 
siderably sharpened and extended. We state the results below; the proofs are 
basically the same as in Section 3 and we refer to [98] for details. 

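Consider first the time-dependent version of Lundberg’s inequality. Just as
in V.4, we substitute $T = yu$ in $\psi(u,T)$ and replace the Lundberg exponent $\gamma$
by $\gamma_y = \alpha_y - y\kappa^*(\alpha_y)$, where $\alpha_y$ is the unique solution of

$$\kappa^{*\prime}(\alpha_y) = \frac1y. \tag{6.13}$$

Elementary convexity arguments show that we always have $\gamma_y > \gamma$ and $\alpha_y > \gamma$,
$\kappa^*(\alpha_y) > 0$ when $y < 1/\kappa^{*\prime}(\gamma)$, whereas $\alpha_y < \gamma$, $\kappa^*(\alpha_y) < 0$ when $y > 1/\kappa^{*\prime}(\gamma)$.


Theorem 6.12 Let $\displaystyle C_+^{(0)}(y) = \frac{1}{\inf_{0\le t\le1}h(t;\alpha_y)}$. Then

$$\psi^{(s)}(u, yu) \le C_+^{(0)}(y)\,h(s;\alpha_y)\,e^{-\gamma_yu}, \quad y < \frac{1}{\kappa^{*\prime}(\gamma)}, \tag{6.14}$$

$$\psi^{(s)}(u) - \psi^{(s)}(u, yu) \le C_+^{(0)}(y)\,h(s;\alpha_y)\,e^{-\gamma_yu}, \quad y > \frac{1}{\kappa^{*\prime}(\gamma)}. \tag{6.15}$$


The next result improves upon the constant $C_+^{(0)}$ in front of $e^{-\gamma u}$ in Theorem
6.11 and supplements it with a lower bound.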


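Theorem 6.13 Let

$$C_- = \inf_{0\le t\le1}\frac{1}{h(t)}\cdot\inf_{x\ge0}\frac{\bar B^{(t)}(x)}{\int_x^\infty e^{\gamma(y-x)}B^{(t)}(dy)}, \quad
C_+ = \sup_{0\le t\le1}\frac{1}{h(t)}\cdot\sup_{x\ge0}\frac{\bar B^{(t)}(x)}{\int_x^\infty e^{\gamma(y-x)}B^{(t)}(dy)}. \tag{6.16}$$

Then for all $s \in [0,1)$ and all $u \ge 0$,

$$C_-h(s)e^{-\gamma u} \le \psi^{(s)}(u) \le C_+h(s)e^{-\gamma u}. \tag{6.17}$$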

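In order to apply Theorem 6.13 to our basic example, we first note that the
function

$$u \mapsto \frac{w\cdot e^{-3u} + (1-w)\cdot e^{-7u}}{\int_u^\infty e^{x-u}\{w\cdot3e^{-3x} + (1-w)\cdot7e^{-7x}\}\,dx} = \frac{6w + 6(1-w)e^{-4u}}{9w + 7(1-w)e^{-4u}}$$

attains its minimum 2/3 for $u = \infty$ and its maximum $6/(7 + 2w)$ for $u = 0$.
Thus

$$C_- = \frac23\inf_{0\le s\le1}\exp\Bigl\{-\lambda\Bigl(\frac{1}{2\pi}\cos2\pi s - \frac{1}{4\pi}\sin2\pi s + \frac{1}{16\pi}\cos4\pi s - \frac{9}{16\pi}\Bigr)\Bigr\} = \frac23\,e^{-0.013\lambda},$$

$$C_+ = \sup_{0\le s\le1}\frac{6\exp\bigl\{-\lambda\bigl(\frac{1}{2\pi}\cos2\pi s - \frac{1}{4\pi}\sin2\pi s + \frac{1}{16\pi}\cos4\pi s - \frac{9}{16\pi}\bigr)\bigr\}}{8 + \cos2\pi s}.$$

Thus e.g. for $\lambda = 1$ (where $\frac23e^{-0.013} = 0.66$, $C_+ = 1.20$),

$$\psi^{(s)}(u) \ge 0.66\cdot\exp\Bigl\{\frac{1}{2\pi}\cos2\pi s - \frac{1}{4\pi}\sin2\pi s + \frac{1}{16\pi}\cos4\pi s - \frac{9}{16\pi}\Bigr\}\cdot e^{-u},$$

$$\psi^{(s)}(u) \le 1.20\cdot\exp\Bigl\{\frac{1}{2\pi}\cos2\pi s - \frac{1}{4\pi}\sin2\pi s + \frac{1}{16\pi}\cos4\pi s - \frac{9}{16\pi}\Bigr\}\cdot e^{-u}.$$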
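
The three constants quoted for $\lambda = 1$ can be reproduced by a crude grid search over the season $s$. This is only a sketch under the assumptions already used in the example: $h$ is taken in the closed form quoted for Example 6.1, and the inner infimum and supremum over $u$ are replaced by the values $2/3$ and $6/(7+2w)$ derived in the text (with $6/7$ for the single season where $w = 0$, in which case the ratio is constant). The grid size is an arbitrary choice.

```python
import math


def log_h(s, lam=1.0):
    """log h(s) in the closed form quoted for Example 6.1."""
    x = 2.0 * math.pi * s
    return lam * (math.cos(x) / (2.0 * math.pi) - math.sin(x) / (4.0 * math.pi)
                  + math.cos(2.0 * x) / (16.0 * math.pi) - 9.0 / (16.0 * math.pi))


def ratio_bounds(s):
    """Infimum and supremum over u of the tail ratio at season s."""
    w = 0.5 * (1.0 + math.cos(2.0 * math.pi * s))
    lower = 2.0 / 3.0 if w > 0.0 else 6.0 / 7.0
    upper = 6.0 / (7.0 + 2.0 * w)
    return lower, upper


grid = [k / 10000.0 for k in range(10000)]
c_plus_0 = max(math.exp(-log_h(s)) for s in grid)                    # Theorem 6.11
c_minus = min(ratio_bounds(s)[0] * math.exp(-log_h(s)) for s in grid)  # Theorem 6.13
c_plus = max(ratio_bounds(s)[1] * math.exp(-log_h(s)) for s in grid)   # Theorem 6.13

print(round(c_plus_0, 2), round(c_minus, 2), round(c_plus, 2))
```

The ordering $C_- \le C_+ \le C_+^{(0)}$ visible in the output shows how much Theorem 6.13 sharpens the constant of Theorem 6.11 in this example.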


Finally, we have the following result: 


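Theorem 6.14 Let $C_+(\gamma_0)$ be as in (6.16) with $\gamma$ replaced by $\gamma_0$ and $h(t)$ by
$h(t;\gamma_0)$, and let $\delta = e^{\kappa^*(\gamma_0)}$. Then

$$0 \le \psi^{(s)}(u) - \psi^{(s)}(u,T) \le C_+(\gamma_0)\,h(s;\gamma_0)\,e^{-\gamma_0u}\,\delta^T. \tag{6.18}$$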


Notes and references The material is from Asmussen & Rolski [98]. Some of the 
present proofs are more elementary by avoiding the general point process machinery 
of [98], but thereby also slightly longer. 


6c Markov-modulated approximations 


A periodic risk model may be seen as a varying environment model, where the 
environment at time t is (s + t) mod 1 € [0,1), with s the initial season. Of 
course, such a deterministic periodic environment may be seen as a special case 
of a Markovian one (allowing a continuous state space E = [0,1) for the envi- 
ronment), and in fact, much of the analysis of the preceding section is modelled 
after the techniques developed in the preceding sections for the case of a finite 
E. This observation motivates looking for a more formal connection between
the periodic model and the one evolving in a finite Markovian environment.
The idea is basically to approximate the (deterministic) continuous clock by
a discrete (random) Markovian one with $n$ ‘months’. Thus, the $n$th Markovian
environmental process $\{J_t\}$ moves cyclically on $\{1,\ldots,n\}$, completing a cycle
within one unit of time on the average, so that the intensity matrix is $\Lambda^{(n)}$ given
by

$$\Lambda^{(n)} = \begin{pmatrix}
-n & n & 0 & \cdots & 0\\
0 & -n & n & \cdots & 0\\
\vdots & & \ddots & \ddots & \vdots\\
n & 0 & 0 & \cdots & -n
\end{pmatrix}. \tag{6.19}$$

Arrivals occur at rate $\beta_{ni}$ and their claim sizes are distributed according to $B_{ni}$
if the governing Markov process is in state $i$. We want to choose the $\beta_{ni}$ and
$B_{ni}$ in order to achieve good convergence to the periodic model. To this end,
one simple choice is

$$\beta_{ni} = \beta\Bigl(\frac{i-1}{n}\Bigr) \quad\text{and}\quad B_{ni} = B^{((i-1)/n)}, \tag{6.20}$$

but others are also possible. We let $\{S_t^{(n)}\}_{t\ge0}$ be the claim surplus process of
the $n$th approximating Markov-modulated model, $M^{(n)} = \sup_{t\ge0}S_t^{(n)}$, and the
ruin probability corresponding to the initial state $i$ of the environment is then

$$\psi_i^{(n)}(u) = P_i(M^{(n)} > u), \tag{6.21}$$

which serves as an approximation to $\psi^{(s)}(u)$ whenever $n$ is large and $i/n \approx s$.
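
To see the approximation at work, one can compare adjustment coefficients. The sketch below uses Example 6.1 with $\lambda = 1$ and unit premium rate, builds the cyclic intensity matrix and the parameters of the simple choice $\beta_{ni} = \beta((i-1)/n)$, and solves for the adjustment coefficient $\gamma_n$ of the $n$th Markov-modulated model. The root criterion used here is not from the text but a short derivation for this particular cyclic matrix: $(\Lambda^{(n)} + (\kappa_{ni}(\alpha))_{\rm diag})h = 0$ reads $h_{i+1} = (1 - \kappa_{ni}(\alpha)/n)h_i$ around the cycle, so a positive eigenvector for eigenvalue 0 exists exactly when $\prod_i(1 - \kappa_{ni}(\alpha)/n) = 1$ with all factors positive.

```python
import math


def beta(t):
    return 3.0 * (1.0 + math.sin(2.0 * math.pi * t))


def mgf(t, alpha):
    """B-hat^{(t)}[alpha] for the mixture w(t) Exp(3) + (1 - w(t)) Exp(7)."""
    w = 0.5 * (1.0 + math.cos(2.0 * math.pi * t))
    return w * 3.0 / (3.0 - alpha) + (1.0 - w) * 7.0 / (7.0 - alpha)


def cyclic_intensity_matrix(n):
    """The matrix of (6.19): rate n from state i to its cyclic successor (n >= 2)."""
    lam = [[0.0] * n for _ in range(n)]
    for i in range(n):
        lam[i][i] = -float(n)
        lam[i][(i + 1) % n] = float(n)
    return lam


def kappa_ni(alpha, i, n):
    """kappa for month i = 1, ..., n with the parameter choice (6.20)."""
    t = (i - 1) / n
    return beta(t) * (mgf(t, alpha) - 1.0) - alpha


def cycle_log_product(alpha, n):
    """log prod_i (1 - kappa_ni(alpha) / n); zero at the adjustment coefficient."""
    return sum(math.log(1.0 - kappa_ni(alpha, i, n) / n) for i in range(1, n + 1))


def gamma_n(n, lo=0.1, hi=1.5, iters=100):
    """Bisection; on [lo, hi] the log-product is positive at lo and negative at hi."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if cycle_log_product(mid, n) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


lam12 = cyclic_intensity_matrix(12)
g12, g1000 = gamma_n(12), gamma_n(1000)
print(g12, g1000)   # both below the periodic model's exponent gamma = 1
```

In this example $\gamma_n$ approaches the value $\gamma = 1$ of the periodic (and averaged) model from below as $n$ grows, which is what Theorem 4.5 leads one to expect for a Markov-modulated model.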


Notes and references See Rolski [745]. 


7 Dual queueing models 


The essence of the results of the present section is that the ruin probabilities 
$\psi_i(u)$, $\psi_i(u,T)$ can be expressed in a simple way in terms of the waiting time
probabilities of a queueing system with the input being the time-reversed input 
of the risk process. This queue is commonly denoted as the Markov-modulated 
M/G/1 queue and has received considerable attention in the last decades. Thus, 
since the settings are equivalent from a mathematical point of view, it is desirable 
to have formulas permitting freely to translate from one setting into the other. 

Let $\beta_i$, $B_i$, $\Lambda$ be the parameters defining the risk process in a random en-
vironment and consider a queueing system governed by a Markov process $\{J_t^*\}$
(‘Markov-modulated’) as follows:


• The intensity matrix for $\{J_t^*\}$ is the time-reversed intensity matrix $\Lambda^* =
(\lambda_{ij}^*)_{i,j\in E}$ of the risk process, $\lambda_{ij}^* = \lambda_{ji}\pi_j/\pi_i$.


• The arrival intensity is $\beta_i$ when $J_t^* = i$;
• Customers arriving when $J_t^* = i$ have service time distribution $B_i$;
• The queueing discipline is FIFO.
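
The time reversal in the first item is a purely mechanical operation on the intensity matrix. The sketch below, for an assumed three-state matrix chosen only for illustration, computes the stationary distribution by power iteration on the uniformized chain and then forms $\lambda_{ij}^* = \lambda_{ji}\pi_j/\pi_i$; the function names are hypothetical.

```python
def stationary(lam, iters=2000):
    """Stationary row vector of an irreducible intensity matrix, via power
    iteration on the uniformised transition matrix P = I + Lambda / c."""
    n = len(lam)
    c = 1.0 + max(-lam[i][i] for i in range(n))
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [pi[j] + sum(pi[i] * lam[i][j] for i in range(n)) / c for j in range(n)]
    total = sum(pi)
    return [x / total for x in pi]


def time_reverse(lam, pi):
    """lambda*_ij = lambda_ji * pi_j / pi_i."""
    n = len(lam)
    return [[lam[j][i] * pi[j] / pi[i] for j in range(n)] for i in range(n)]


# Illustrative (assumed) intensity matrix of the environment.
LAM = [[-3.0, 2.0, 1.0],
       [1.0, -1.0, 0.0],
       [2.0, 2.0, -4.0]]

pi = stationary(LAM)
lam_star = time_reverse(LAM, pi)
row_sums = [sum(row) for row in lam_star]
balance = [sum(pi[i] * lam_star[i][j] for i in range(3)) for j in range(3)]
print(pi, row_sums, balance)
```

The reversed matrix is again an intensity matrix with the same diagonal and the same stationary distribution $\pi$, which is what makes the stationarity argument in the proof of Proposition 7.1 work.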


The actual waiting time process $\{W_n\}_{n=1,2,\ldots}$ and the virtual waiting time
(workload) process $\{V_t\}_{t\ge0}$ are defined exactly as for the renewal model in Chap-
ter VI.


Proposition 7.1 Assume $V_0 = 0$. Then

$$P_i(\tau(u) \le T,\ J_T = j) = \frac{\pi_j}{\pi_i}\,P_j(V_T > u,\ J_T^* = i). \tag{7.1}$$

In particular,

$$\psi_i(u,T) = \frac{1}{\pi_i}P_\pi(V_T > u,\ J_T^* = i) = P_\pi(V_T > u \mid J_T^* = i), \tag{7.2}$$

$$\psi_i(u) = \frac{1}{\pi_i}P(V > u,\ J^* = i) = P(V > u \mid J^* = i), \tag{7.3}$$

where $(V, J^*)$ is the steady-state limit of $(V_t, J_t^*)$.


Proof. Consider stationary versions of $\{J_t\}_{0\le t\le T}$, $\{J_t^*\}_{0\le t\le T}$. Then we may
assume that $J_t^* = J_{T-t}$, $0 \le t \le T$ and that the risk process $\{R_t\}_{0\le t\le T}$ is
coupled to the virtual waiting process $\{V_t\}_{0\le t\le T}$ as in the basic duality lemma
(Theorem III.2.1). The first conclusion of that result then states that the events
$\{\tau(u) \le T,\ J_0 = i,\ J_T = j\}$ and $\{V_T > u,\ J_0^* = j,\ J_T^* = i\}$ coincide. Taking prob-
abilities and using the stationarity yields

$$\pi_iP_i(\tau(u) \le T,\ J_T = j) = \pi_jP_j(V_T > u,\ J_T^* = i),$$

and (7.1) follows. For (7.2), just sum (7.1) over $j$, and for (7.3), let $T \to \infty$ in
(7.2) and use that $\lim P_j(V_T > u,\ J_T^* = i) = P(V > u,\ J^* = i)$ for all $j$.


Now let $I_n^*$ denote the environment when customer $n$ arrives and $I^*$ the
steady-state limit.


Proposition 7.2 The relation between the steady-state distributions of the ac-
tual and the virtual waiting time distribution is given by

$$P(W > u,\ I^* = i) = \frac{\beta_i}{\beta^*}\,P(V > u,\ J^* = i), \tag{7.4}$$

where $\beta^* = \sum_{j\in E}\pi_j\beta_j$. In particular,

$$\psi_i(u) = \frac{\beta^*}{\pi_i\beta_i}\,P(W > u,\ I^* = i). \tag{7.5}$$

Proof. Identifying the distribution of $(W, I^*)$ with the time-average, we have

$$\frac1N\sum_{n=1}^NI(W_n > u,\ I_n^* = i) \to P(W > u,\ I^* = i) \quad\text{a.s.},\ N \to \infty.$$

However, if $T$ is large, on average $\beta^*T$ customers arrive in $[0,T]$, and of these,
on average $\beta_iTP(V > u,\ J^* = i)$ see $W > u$, $I^* = i$. Taking the ratio yields
(7.4), and (7.5) follows from (7.4) and (7.3).


Notes and references One of the earliest papers drawing attention to the Markov- 
modulated M/G/1 queue is Burman & Smith [210]. The first comprehensive solution 
of the waiting time problem is Regterschot & de Smit [729], a paper relying heavily 
on classical complex plane methods. A more probabilistic treatment was given by 
Asmussen [59], and further references (to which we add Prabhu & Zhu [714]) can be 
found therein. 

Proposition 7.1 is from Asmussen [58], with (7.3) improving somewhat upon (2.7)
of that paper. The relation (7.4) can be found in Regterschot & de Smit [729]; a general
formalism allowing this type of conclusion is ‘conditional PASTA’, see van Doorn &
Regterschot [327].

In the setting of the periodic model of Section 6, the dual queueing model is a
periodic M/G/1 queue with arrival rate $\beta(-t)$ and service time distribution $B^{(-t)}$ at
time $t$ of the year (assuming w.l.o.g. that $\beta(t)$, $B^{(t)}$ have been periodically extended
to negative $t$). With $\{V_t\}$ denoting the workload process of the periodic queue, $\rho < 1$
then ensures that $V^{(s)} = \lim_{N\to\infty}V_{N+s}$ exists in distribution, and one has

$$P^{(s)}(\tau(u) \le T) = P^{(1-s-T)}(V_T > u), \tag{7.6}$$

$$P^{(-s-T)}(\tau(u) \le T) = P^{(s)}(V_T > u), \tag{7.7}$$

$$P^{(1-s)}(\tau(u) < \infty) = P^{(0)}(V^{(s)} > u). \tag{7.8}$$

For treatments of periodic M/G/1 queues, see in particular Harrison & Lemoine [450],
Lemoine [579, 580], and Rolski [745].


Chapter VIII


Level-dependent risk processes

1 Introduction 


We assume as in Chapter IV that the claim arrival process $\{N_t\}$ is Poisson with
rate $\beta$, and that the claim sizes $U_1, U_2, \ldots$ are i.i.d. with common distribution
$B$ and independent of $\{N_t\}$. Thus, the aggregate claims in $(0,t]$ are

$$A_t = \sum_{i=1}^{N_t}U_i \tag{1.1}$$

(other terms are accumulated claims or total claims). However, the increase of
the surplus process $R_t$ in between the claim payments now does not have to be
linear with constant slope, but can depend on the current surplus level. This can
always be interpreted as a modified premium rate $p(r)$ charged at the current
reserve $R_t = r$ (but note that the actual reason for the level dependence of
the increase may be quite different, see the examples below). Thus, in between
jumps, $\{R_t\}$ moves according to the differential equation $\dot R = p(R)$, and the
evolution of the reserve may be described by the equation¹

$$R_t = u - A_t + \int_0^tp(R_s)\,ds. \tag{1.2}$$

¹Here it is assumed that $p(r)$ is a deterministic function. Stochastic $p(r)$ will be discussed
in Sections 5 and 6.

As earlier,

$$\psi(u) = P\Bigl(\inf_{t\ge0}R_t < 0 \,\Big|\, R_0 = u\Bigr), \quad \psi(u,T) = P\Bigl(\inf_{0\le t\le T}R_t < 0 \,\Big|\, R_0 = u\Bigr)$$

denote the ruin probabilities with initial reserve $u$ and infinite, resp. finite hori-
zon, and $\tau(u) = \inf\{t \ge 0 : R_t < 0\}$ is the time to ruin starting from $R_0 = u$
so that $\psi(u) = P(\tau(u) < \infty)$, $\psi(u,T) = P(\tau(u) \le T)$.

The following examples provide some main motivation for studying the 
model: 


Example 1.1 Assume that the company reduces the premium rate from $p_1$ to
$p_2$ when the reserve comes above some critical value $v$. That is, $p_1 > p_2$ and

$$p(r) = \begin{cases}p_1 & r \le v,\\ p_2 & r > v.\end{cases} \tag{1.3}$$

One reason could be competition, where one would try to attract new customers
as soon as the business has become reasonably safe. Another could be the pay-
out of dividends: here the premium paid by the policy holders is the same for
all $r$, but when the reserve comes above $v$, dividends are paid out at rate $p_1 -
p_2$. Possibilities for more general level-dependent premium (dividend payment)
schemes than the two-step rule above are obvious.


Example 1.2 (INTEREST) If the company charges a constant premium rate 
p but invests its money at interest rate i, we get p(r) = p + ir. 
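
For the interest rule of Example 1.2 the path between claims is explicit: solving $\dot R = p + iR$ gives $R_t = (r + p/i)e^{it} - p/i$ when starting from $r$. The following Monte Carlo sketch uses this to estimate a finite-horizon ruin probability with and without interest. Everything numeric here is an assumption made for the illustration (exponential claims with mean 1, $\beta = 1$, $p = 1.5$, $u = 2$, horizon 50, interest rate 0.05, 4000 paths); each path gets its own seed so that both runs see the same claims, and ruin is only checked at claim epochs, which is sufficient because the reserve increases between claims while it is non-negative.

```python
import math
import random


def flow(r, t, p, i):
    """Reserve after t time units without claims under dR/dt = p + i * R."""
    if i == 0.0:
        return r + p * t
    return (r + p / i) * math.exp(i * t) - p / i


def ruined(u, horizon, p, i, beta, delta, rng):
    """Simulate one path; True if the reserve drops below 0 before the horizon."""
    r, t = u, 0.0
    while True:
        gap = rng.expovariate(beta)          # time to the next claim
        if t + gap > horizon:
            return False
        r = flow(r, gap, p, i) - rng.expovariate(delta)
        t += gap
        if r < 0.0:
            return True


def estimate(u, horizon, p, i, beta, delta, n_paths):
    """Crude Monte Carlo estimate of psi(u, horizon); path k is seeded with k."""
    hits = sum(ruined(u, horizon, p, i, beta, delta, random.Random(k))
               for k in range(n_paths))
    return hits / n_paths


psi_plain = estimate(2.0, 50.0, 1.5, 0.0, 1.0, 1.0, 4000)
psi_interest = estimate(2.0, 50.0, 1.5, 0.05, 1.0, 1.0, 4000)
print(psi_plain, psi_interest)
```

With common random numbers the reserve with interest dominates the one without interest path by path until ruin, so the second estimate can never exceed the first. For comparison, the infinite-horizon ruin probability of the plain compound Poisson model with these parameters is $\frac23e^{-2/3} \approx 0.342$.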


Example 1.3 (ABSOLUTE RUIN) Consider the same situation as in Example
1.2, but assume now that the company borrows the deficit in the bank when
the reserve goes negative, say at interest rate $i'$. Thus at deficit $x > 0$ (meaning
$R_t = -x$), the payout rate of interest is $i'x$ and absolute ruin occurs when this
exceeds the premium inflow $p$, i.e. when $x > p/i'$, rather than when the reserve
itself becomes negative. In this situation, we can put $\tilde R_t = R_t + p/i'$,

$$\tilde p(r) = \begin{cases}p + i(r - p/i') & r > p/i',\\ i'r & 0 \le r \le p/i'.\end{cases}$$

Then the ruin problem for $\{\tilde R_t\}$ is of the type defined above, and the probability
of absolute ruin with initial reserve $u \in [-p/i', \infty)$ is given by $\tilde\psi(u + p/i')$.


Example 1.4 (TAX) If the insurance company makes profit, it will have to
pay tax. One way to model this is to assume that whenever the risk process $R_t$
is in a running maximum, a certain proportion V of the premium income is paid
to the tax authority (such a model is related to the so-called loss-carried-forward
scheme). The resulting premium rule is p(r) = Vp in the running maxima and
p(r) = p otherwise. Due to the non-Markovian character, the analysis for this
model is somewhat different from the above examples, see Section 4.


Now let us return to the general Markovian model. 
Proposition 1.5 Either $\psi(u) = 1$ for all $u$, or $\psi(u) < 1$ for all $u$.


Proof. Obviously $\psi(u) \le \psi(v)$ when $u \ge v$. Assume $\psi(u) < 1$ for some $u$. If
$R_0 = v < u$, there is positive probability, say $\epsilon$, that $\{R_t\}$ will reach level $u$
before the first claim arrives. Hence in terms of survival probabilities, $1 - \psi(v)
\ge \epsilon(1 - \psi(u)) > 0$ so that $\psi(v) < 1$.


A basic question is thus which premium rules $p(r)$ ensure that $\psi(u) < 1$.
No tractable necessary and sufficient condition is known in complete generality
of the model. However, it seems reasonable to assume monotonicity ($p(r)$ is
decreasing in Example 1.1 and increasing in Example 1.2) for $r$ sufficiently large
so that $p(\infty) = \lim_{r\to\infty} p(r)$ exists. This is basically covered by the following
result (but note that the case $p(r) \to \beta\mu_B$ requires a more detailed analysis
and that $\mu_B < \infty$ is not always necessary for $\psi(u) < 1$ when $p(r) \to \infty$, cf.
[APQ, pp. 388-389]):


Theorem 1.6 (a) If $p(r) \le \beta\mu_B$ for all sufficiently large $r$, then $\psi(u) = 1$ for
all $u$;

(b) If $p(r) \ge \beta\mu_B + \epsilon$ for all sufficiently large $r$ and some $\epsilon > 0$, then $\psi(u) < 1$
for all $u$, and $P(R_t \to \infty) > 0$.


Proof. This follows by a simple comparison with the compound Poisson model.
Let $\psi_p(u)$ refer to the compound Poisson model with the same $\beta$, $B$ and (constant)
premium rate $p$.

In case (a), choose $u_0$ such that $p(r) \le p = \beta\mu_B$ for $r \ge u_0$. Starting
from $R_0 = u_0$, the probability that $R_t < u_0$ for some $t$ is at least $\psi_p(0) = 1$ (cf.
Proposition IV.1.2(d)), hence $R_t < u_0$ also for a whole sequence of $t$'s converging
to $\infty$. However, obviously $\inf_{u \le u_0} \psi(u) > 0$, and hence by a geometric trials
argument $\psi(u_0) = 1$ so that $\psi(u) = 1$ for all $u$ by Proposition 1.5. In case
(b), choose $u_0$ such that $p(r) \ge p = \beta\mu_B + \epsilon$ for $r \ge u_0$. Then if $u \ge u_0$, we
have $\psi(u) \le \psi_p(u - u_0)$ and, appealing to Proposition IV.1.2 once more, that
$\psi_p(u - u_0) < 1$. Hence $\psi(u) < 1$ for all $u$ by Proposition IV.1.2(d).


We next recall the following result, which was proved in III.3. Here $\{V_t\}_{t \ge 0}$
is a storage process which has reflection at zero and initial condition $V_0 = 0$.
In between jumps, $\{V_t\}$ decreases at rate $p(v)$ when $V_t = v$ (i.e., $\dot V = -p(V)$).
That is, instead of (1.2) we have


$$V_t = A_t - \int_0^t p(V_s)\,ds, \tag{1.4}$$


and we use the convention $p(0) = 0$ to make zero a reflecting barrier (when
hitting 0, $\{V_t\}$ remains at 0 until the next arrival).


Theorem 1.7 For any $T < \infty$, one can couple the risk process and the storage
process on $[0,T]$ in such a way that the events $\{\tau(u) \le T\}$ and $\{V_T > u\}$
coincide. In particular,


$$\psi(u, T) = P(V_T > u), \tag{1.5}$$


and the process $\{V_t\}$ has a proper limit in distribution, say $V$, if and only if
$\psi(u) < 1$ for all $u$. Then


$$\psi(u) = P(V > u). \tag{1.6}$$


In order to make Theorem 1.7 applicable, we thus need to look more into
the stationary distribution $G$, say, for the storage process $\{V_t\}$. It is intuitively
obvious and not too hard to prove that $G$ is a mixture of two components, one
having an atom at 0 of size $\gamma_0$, say, and the other being given by a density $g(x)$
on $(0,\infty)$. It follows in particular that


$$\psi(u) = \int_u^\infty g(y)\,dy. \tag{1.7}$$


Proposition 1.8


$$p(x)g(x) = \gamma_0 \beta \bar B(x) + \beta \int_0^x \bar B(x-y)\,g(y)\,dy. \tag{1.8}$$


Proof. In stationarity, the flow of mass from $[0,x]$ to $(x,\infty)$ must be the same
as the flow the other way. In view of the path structure of $\{V_t\}$, this means that
the rate of upcrossings of level $x$ must be the same as the rate of downcrossings.
Now obviously, the l.h.s. of (1.8) is the rate of downcrossings (the event of an
arrival in $[t, t + dt]$ can be neglected so that a path of $\{V_t\}$ corresponds to a
downcrossing in $[t, t + dt]$ if and only if $V_t \in [x, x + p(x)dt]$). An attempt of
an upcrossing occurs as a result of an arrival, say when $\{V_t\}$ is in state $y$, and
is successful if the jump size is larger than $x - y$. Considering the cases $y = 0$
and $0 < y \le x$ separately, we arrive at the desired interpretation of the r.h.s. of
(1.8) as the rate of upcrossings.
Define


$$\omega(x) = \int_0^x \frac{1}{p(t)}\,dt.$$


Then $\omega(x)$ is the time it takes for the reserve to reach level $x$ provided it starts
with $R_0 = 0$ and no claims arrive. Note that it may happen that $\omega(x) = \infty$ for
all $x > 0$, say if $p(r)$ goes to 0 at rate $r$ or faster as $r \downarrow 0$.


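Corollary 1.9 Assume that $B$ is exponential with rate $\delta$, $\bar B(x) = e^{-\delta x}$, and that
$\omega(x) < \infty$ for all $x > 0$. Then the ruin probability is $\psi(u) = \int_u^\infty g(y)\,dy$, where


$$g(x) = \frac{\gamma_0\beta}{p(x)} \exp\{\beta\omega(x) - \delta x\} \quad\text{and}\quad \frac{1}{\gamma_0} = 1 + \int_0^\infty \frac{\beta}{p(x)} \exp\{\beta\omega(x) - \delta x\}\,dx. \tag{1.9}$$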


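Proof. We may rewrite (1.8) as


$$g(x) = \frac{1}{p(x)}\Big\{\gamma_0\beta e^{-\delta x} + \beta e^{-\delta x}\int_0^x e^{\delta y} g(y)\,dy\Big\} = \frac{\beta}{p(x)}\, e^{-\delta x}\kappa(x),$$


where $\kappa(x) = \gamma_0 + \int_0^x e^{\delta y} g(y)\,dy$ so that


$$\kappa'(x) = e^{\delta x} g(x) = \frac{\beta}{p(x)}\,\kappa(x).$$


Thus


$$\log\kappa(x) = \log\kappa(0) + \int_0^x \frac{\beta}{p(t)}\,dt = \log\kappa(0) + \beta\omega(x),$$
$$\kappa(x) = \kappa(0)e^{\beta\omega(x)} = \gamma_0 e^{\beta\omega(x)},$$
$$g(x) = e^{-\delta x}\kappa'(x) = e^{-\delta x}\gamma_0\beta\omega'(x)e^{\beta\omega(x)},$$


which is the same as the expression in (1.9). That $\gamma_0$ has the asserted value is
a consequence of $1 = \|G\| = \gamma_0 + \int_0^\infty g(y)\,dy$.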
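

As a numerical illustration of Corollary 1.9, the following sketch evaluates the ruin probability by trapezoidal quadrature of the unnormalised density $(\beta/p(x))\exp\{\beta\omega(x) - \delta x\}$. The function name, the truncation point `x_max` and the grid size `n` are illustrative assumptions, and the premium rule `p` is assumed strictly positive on the grid.

```python
import math

def ruin_prob_exponential(p, beta, delta, u, x_max=100.0, n=100_000):
    """Approximate psi(u) of Corollary 1.9 (exponential claims with rate delta).

    p is a callable premium rule with p(x) > 0; the integrals over [0, inf)
    are truncated at x_max (an assumption that must suit the chosen p).
    """
    h = x_max / n
    omega = 0.0                     # omega(x) = int_0^x dt / p(t), built up on the grid
    prev_inv_p = 1.0 / p(0.0)
    prev_g = beta * prev_inv_p      # unnormalised density at x = 0
    total = 0.0                     # int_0^x_max of the unnormalised density
    tail = 0.0                      # int_u^x_max of the unnormalised density
    for k in range(1, n + 1):
        x = k * h
        inv_p = 1.0 / p(x)
        omega += 0.5 * h * (prev_inv_p + inv_p)
        g = beta * inv_p * math.exp(beta * omega - delta * x)
        piece = 0.5 * h * (prev_g + g)
        total += piece
        if k - 1 >= u / h - 1e-9:   # grid cell [x - h, x] lies to the right of u
            tail += piece
        prev_inv_p, prev_g = inv_p, g
    # psi(u) = gamma_0 * tail, with 1 / gamma_0 = 1 + total
    return tail / (1.0 + total)
```

For a constant premium rate $p$ the result should agree with the classical formula $\psi(u) = \frac{\beta}{p\delta}e^{-(\delta - \beta/p)u}$, which gives a simple sanity check of the quadrature.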


Remark 1.10 The exponential case in Corollary 1.9 is the only one in which 
explicit formulas are known (or almost so; see further the notes to Section 2), 
and thus it becomes important to develop algorithms for computing the ruin 
probabilities. We next outline one possible approach based upon the integral 
equation (1.8) (another one is based upon numerical solution of a system of 
differential equations which can be derived under phase-type assumptions, see 
further IX.7). 
A Volterra integral equation has the general form


$$g(x) = h(x) + \int_0^x K(x,y)\,g(y)\,dy, \tag{1.10}$$


where $g(x)$ is an unknown function ($x \ge 0$), $h(x)$ is known and $K(x,y)$ is a
suitable kernel. Dividing (1.8) by $p(x)$ and letting


$$K(x,y) = \frac{\beta\bar B(x-y)}{p(x)}, \qquad h(x) = \frac{\gamma_0\beta\bar B(x)}{p(x)},$$


we see that for fixed $\gamma_0$, the function $g(x)$ in (1.8) satisfies (1.10). For the
purpose of explicit computation of $g(x)$ (and thereby $\psi(u)$), the general theory
of Volterra equations does not seem to lead beyond the exponential case already
treated in Corollary 1.9. However, one might try instead a numerical solution.
We consider the simplest possible approach based upon the most basic numerical
integration procedure, the trapezoidal rule


$$\int_{x_0}^{x_N} f(x)\,dx \approx \frac{h}{2}\big[f(x_0) + 2f(x_1) + 2f(x_2) + \cdots + 2f(x_{N-1}) + f(x_N)\big],$$


where $x_k = x_0 + kh$. Fixing $h > 0$, letting $x_0 = 0$ (i.e. $x_k = kh$) and writing
$g_k = g(x_k)$, $h_k = h(x_k)$, $K_{k,\ell} = K(x_k, x_\ell)$, this leads to


$$g_N = h_N + \frac{h}{2}\{K_{N,0}g_0 + K_{N,N}g_N\} + h\{K_{N,1}g_1 + \cdots + K_{N,N-1}g_{N-1}\},$$


$$g_N = \frac{h_N + \frac{h}{2}K_{N,0}g_0 + h\{K_{N,1}g_1 + \cdots + K_{N,N-1}g_{N-1}\}}{1 - \frac{h}{2}K_{N,N}}. \tag{1.11}$$


In the case of (1.8), the unknown $\gamma_0$ is involved. However, (1.11) is easily
seen to be linear in $\gamma_0$. One therefore first makes a trial solution $g^*(x)$ corresponding
to $\gamma_0 = 1$, i.e. $h(x) = h^*(x) = \beta\bar B(x)/p(x)$, and computes $\int_0^\infty g^*(x)\,dx$
numerically (by truncation and using the $g^*_k$). Then $g(x) = \gamma_0 g^*(x)$, and $\|G\|
= 1$ then yields


$$\frac{1}{\gamma_0} = 1 + \int_0^\infty g^*(x)\,dx, \tag{1.12}$$


from which $\gamma_0$ and hence $g(x)$ and $\psi(u)$ can be computed.
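

The recursion (1.11) together with the normalisation (1.12) can be sketched in a few lines. This is a minimal illustration, not the algorithm of the cited reference: the function names, the truncation point `x_max` and the number of grid points `n` are assumptions, `tail` stands for the claim tail $\bar B$, and the initial reserve `u` is rounded to the nearest grid point.

```python
import math

def solve_trial_density(p, beta, tail, x_max, n):
    """Trapezoidal scheme (1.11) for the trial solution g* (atom gamma_0 set to 1)."""
    h = x_max / n
    kernel = lambda x, y: beta * tail(x - y) / p(x)   # K(x, y)
    h_star = lambda x: beta * tail(x) / p(x)          # h*(x)
    g = [h_star(0.0)]                                 # at x = 0 the integral term vanishes
    for N in range(1, n + 1):
        x = N * h
        s = 0.5 * kernel(x, 0.0) * g[0]
        s += sum(kernel(x, k * h) * g[k] for k in range(1, N))
        g.append((h_star(x) + h * s) / (1.0 - 0.5 * h * kernel(x, x)))
    return h, g

def ruin_prob_volterra(p, beta, tail, u, x_max=40.0, n=800):
    """psi(u) = gamma_0 * int_u^inf g*(x) dx with 1/gamma_0 from (1.12)."""
    h, g = solve_trial_density(p, beta, tail, x_max, n)

    def trapezoid(a, b):
        if a >= b:
            return 0.0
        return h * (0.5 * g[a] + sum(g[a + 1:b]) + 0.5 * g[b])

    k_u = round(u / h)                                # u rounded to the grid
    return trapezoid(k_u, n) / (1.0 + trapezoid(0, n))
```

The cost is quadratic in `n` because every new grid value uses all previous ones; for exponential claims and constant premium the output can be compared with the closed form of Corollary 1.9.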


Remark 1.11 Plugging (1.7) into (1.8), one obtains by partial integration and
reordering


$$p(u)\psi'(u) - \beta\psi(u) + \beta\int_0^u \psi(u-y)\,dB(y) + \beta\bar B(u) = 0, \quad u \ge 0, \tag{1.13}$$


where in the last term it was used that $\gamma_0 + \psi(0) = 1$. It is also possible to
derive (1.13) directly (without reference to storage processes) in the following
way. For $h > 0$, consider the time interval $(0,h)$ and condition on the time $t$
and the amount $y$ of the first claim in $(0,h)$. Since the probability that there is
no claim in $(0,h)$ is $e^{-\beta h}$ and the probability that the first claim occurs between
time $t$ and $t + dt$ is $\beta e^{-\beta t}dt$, one obtains, using the Markov property of the
process $R_t$,


$$\psi(u) = e^{-\beta h}\,\psi\Big(u + \int_0^h p(R_s)\,ds\Big) + \int_0^h \beta e^{-\beta t}\,\bar B\Big(u + \int_0^t p(R_s)\,ds\Big)\,dt$$
$$\qquad + \int_0^h \beta e^{-\beta t} \int_0^{u + \int_0^t p(R_s)ds} \psi\Big(u + \int_0^t p(R_s)\,ds - y\Big)\,dB(y)\,dt.$$


Since every other part of the above equation is differentiable w.r.t. $h$, also
$\psi(u + \int_0^h p(R_s)\,ds)$ has to be (by symmetry this also establishes the differentiability
w.r.t. $u$). Taking the derivative w.r.t. $h$ and subsequently setting $h = 0$ then
gives (1.13). The formal framework for this approach is of course that of generators
(cf. Chapter II).

For the particular case $\bar B(x) = e^{-\delta x}$, one can multiply (1.13) by $e^{\delta u}$ and
differentiate the resulting equation w.r.t. $u$. In this way one obtains the second-order
differential equation


$$p(u)\psi''(u) + \big(p'(u) + \delta p(u) - \beta\big)\psi'(u) = 0$$


with the boundary conditions $p(0)\psi'(0) = \beta(\psi(0) - 1)$ and $\lim_{u\to\infty}\psi(u) = 0$.
This leads to


$$\psi(u) = \frac{\beta\int_u^\infty \frac{1}{p(x)}\exp\{\beta\omega(x) - \delta x\}\,dx}{1 + \beta\int_0^\infty \frac{1}{p(x)}\exp\{\beta\omega(x) - \delta x\}\,dx}, \tag{1.14}$$


which is again the result of Corollary 1.9.


1a Two-step premium functions


We now assume the premium function to be constant in two levels as in Example
1.1,


$$p(r) = \begin{cases} p_1, & r \le v, \\ p_2, & r > v. \end{cases} \tag{1.15}$$


We may think of the risk reserve process $R_t$ as pieced together from two risk
reserve processes $R_t^{(1)}$ and $R_t^{(2)}$ with constant premiums $p_1$, $p_2$, such that $R_t$
coincides with $R_t^{(1)}$ under level $v$ and with $R_t^{(2)}$ above level $v$. If, as outlined
in Example 1.1, the reduced income above $v$ is due to dividend payments, this
model is usually referred to as the threshold dividend model.² For an example
of a sample path of such a refracted process, see Fig. VIII.1.


FIGURE VIII.1 


Proposition 1.12 Let $\psi^{(i)}(u)$ denote the ruin probability of $\{R_t^{(i)}\}$, define $\sigma =
\inf\{t \ge 0 : R_t < v\}$, let $\pi(u)$ be the probability of ruin between $\sigma$ and the next
upcrossing of $v$ (including ruin possibly at $\sigma$), and let


$$\phi_v(u) = \frac{1 - \psi^{(1)}(u)}{1 - \psi^{(1)}(v)}, \quad 0 \le u \le v. \tag{1.16}$$


Then


$$\psi(u) = \begin{cases} 1 - \phi_v(u) + \phi_v(u)\psi(v), & 0 \le u \le v, \\[4pt] \dfrac{\pi(v)}{1 + \pi(v) - \psi^{(2)}(0)}, & u = v, \\[8pt] \pi(u) + \big(\psi^{(2)}(u - v) - \pi(u)\big)\psi(v), & v < u < \infty. \end{cases}$$


Proof. Recall from Proposition II.2.6 that $\phi_v(u) = 1 - \psi_v(u)$ is the probability
for $\{R_t^{(1)}\}$ (and hence also for $\{R_t\}$) of upcrossing level $v$ before ruin given the
process starts at $u \le v$. Hence, for $u < v$ the probability of ruin for $\{R_t\}$ will be
the sum of the probability of being ruined before upcrossing $v$, $1 - \phi_v(u)$, and
the probability of ruin given we hit $v$ first, $\phi_v(u)\psi(v)$.

Similarly, if $u \ge v$ then the probability of ruin is the sum of being ruined
between $\sigma$ and the next upcrossing of $v$ which is $\pi(u)$, and the probability of
ruin given the process hits $v$ before $(-\infty, 0)$ again after $\sigma$,


$$\big(P_u(\sigma < \infty) - \pi(u)\big)\psi(v) = \big(\psi^{(2)}(u - v) - \pi(u)\big)\psi(v).$$


This yields the expression for $u > v$, and the one for $u = v$ then immediately
follows.


² The corresponding dividend payout scheme, namely to pay nothing when $R_t \le v$ and to
pay out at rate $p_1 - p_2$ when $R_t > v$, turns out to maximize the expected discounted sum of
dividend payments until ruin under certain assumptions, see e.g. Gerber & Shiu [412].


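Example 1.13 Assume that $B$ is exponential, $\bar B(x) = e^{-\delta x}$. Then


$$\psi^{(1)}(u) = \frac{\beta}{p_1\delta}\,e^{-\gamma_1 u}, \qquad \psi^{(2)}(u) = \frac{\beta}{p_2\delta}\,e^{-\gamma_2 u},$$


where $\gamma_i = \delta - \beta/p_i$, so that


$$\phi_v(u) = \frac{1 - \frac{\beta}{p_1\delta}e^{-\gamma_1 u}}{1 - \frac{\beta}{p_1\delta}e^{-\gamma_1 v}}.$$


Furthermore, for $u \ge v$, $P(\sigma < \infty) = \psi^{(2)}(u - v)$ and the conditional distribution
of $v - R_\sigma$ given $\sigma < \infty$ is exponential with rate $\delta$. If $v - R_\sigma > v$ (i.e. $R_\sigma < 0$), ruin occurs at
time $\sigma$. If $v - R_\sigma = x \in [0, v]$, the probability of ruin before the next upcrossing
of $v$ is $1 - \phi_v(v - x)$. Hence


$$\pi(u) = \psi^{(2)}(u - v)\Big\{e^{-\delta v} + \int_0^v \big(1 - \phi_v(v - x)\big)\,\delta e^{-\delta x}\,dx\Big\}$$
$$= \frac{\beta}{p_2\delta}\,e^{-\gamma_2(u - v)}\Big\{1 - \int_0^v \frac{1 - \frac{\beta}{p_1\delta}e^{-\gamma_1(v - x)}}{1 - \frac{\beta}{p_1\delta}e^{-\gamma_1 v}}\,\delta e^{-\delta x}\,dx\Big\}$$
$$= \frac{\beta}{p_2\delta}\,e^{-\gamma_2(u - v)}\Big\{1 - \frac{1 - e^{-\delta v} - \frac{\beta}{p_1(\gamma_1 - \delta)}\,e^{-\gamma_1 v}\big(e^{(\gamma_1 - \delta)v} - 1\big)}{1 - \frac{\beta}{p_1\delta}e^{-\gamma_1 v}}\Big\}$$
$$= \frac{\beta}{p_2\delta}\,e^{-\gamma_2(u - v)}\Big\{1 - \frac{1 - e^{-\gamma_1 v}}{1 - \frac{\beta}{p_1\delta}e^{-\gamma_1 v}}\Big\}.$$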

Also for general phase-type distributions, all quantities in Proposition 1.12 
can be found explicitly, see IX.7. 
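

The pieces of Proposition 1.12 and Example 1.13 can be assembled into a short routine for exponential claims. This is a sketch under the assumption that both $p_1$ and $p_2$ exceed $\beta/\delta$ (so that $\gamma_1, \gamma_2 > 0$); the function and variable names are illustrative.

```python
import math

def two_step_ruin_prob(u, v, p1, p2, beta, delta):
    """psi(u) for the threshold model (1.15) with exponential(delta) claims."""
    c1, c2 = beta / (p1 * delta), beta / (p2 * delta)
    g1, g2 = delta - beta / p1, delta - beta / p2      # gamma_1, gamma_2

    def psi2(x):                                       # classical psi^(2)
        return c2 * math.exp(-g2 * x)

    def phi(x):                                        # phi_v(x) of (1.16), 0 <= x <= v
        return (1.0 - c1 * math.exp(-g1 * x)) / (1.0 - c1 * math.exp(-g1 * v))

    def pi(x):                                         # pi(x) of Example 1.13, x >= v
        ratio = (1.0 - math.exp(-g1 * v)) / (1.0 - c1 * math.exp(-g1 * v))
        return psi2(x - v) * (1.0 - ratio)

    psi_v = pi(v) / (1.0 + pi(v) - psi2(0.0))          # value at the threshold
    if u <= v:
        return 1.0 - phi(u) + phi(u) * psi_v
    return pi(u) + (psi2(u - v) - pi(u)) * psi_v
```

Setting $p_1 = p_2 = p$ should reproduce the classical value $\frac{\beta}{p\delta}e^{-(\delta - \beta/p)u}$ for every $u$, which is a convenient consistency check.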
1b Multi-step premium functions


In a similar manner one can investigate a more general model with a premium
rule of the form


$$p(r) = \begin{cases} p_1, & 0 = v_0 \le r < v_1, \\ p_2, & v_1 \le r < v_2, \\ \ \vdots \\ p_k, & r \ge v_{k-1}. \end{cases} \tag{1.17}$$


Assume that $p_k > \rho$. A similar approach as in Remark 1.11 then gives the
piece-wise integro-differential equation


$$p_i\psi'(u) - \beta\psi(u) + \beta\int_0^u \psi(u - z)\,dB(z) + \beta\bar B(u) = 0 \tag{1.18}$$


for $v_{i-1} \le u < v_i$ and $i = 1, \ldots, k - 1$. For the solution $\psi(u)$ to be continuous,
we require the boundary conditions (contact conditions)


$$\lim_{u \to v_i+} \psi(u) = \lim_{u \to v_i-} \psi(u) \tag{1.19}$$


and from $\eta > 0$ we have


$$\lim_{u \to \infty} \psi(u) = 0. \tag{1.20}$$


Note that $\psi(u)$ is not differentiable at the boundaries of the layers, as in view
of (1.18) and continuity of $\psi$ the conditions (1.19) can be rewritten in the form


$$p_{i+1}\,\frac{\partial\psi}{\partial u}\Big|_{v_i+} = p_i\,\frac{\partial\psi}{\partial u}\Big|_{v_i-}.$$


For exponential claim distribution $\bar B(x) = e^{-\delta x}$, the system (1.18) can be solved
explicitly in the following way. Akin to the procedure in Remark 1.11, first
transform (1.18) into


$$p_i\psi''(u) + (p_i\delta - \beta)\psi'(u) = 0, \quad u \in [v_{i-1}, v_i). \tag{1.21}$$


Using the notation


$$\psi(u) = \sum_{i=1}^k I\{v_{i-1} \le u < v_i\}\,\psi^{[i]}(u), \tag{1.22}$$


where the function $\psi^{[i]}(u)$ is the solution of (1.21) for $u \in [v_{i-1}, v_i)$ for each $i$,
we obtain


$$\psi^{[i]}(u) = A^{(i)} + C^{(i)}e^{-\gamma_i u},$$


where $\gamma_i = \delta - \beta/p_i$ is the (unique) positive solution of the Lundberg equation
$\beta(\hat B[\alpha] - 1) - p_i\alpha = 0$.

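It remains to establish the constants in the above representation: from (1.20),
we have $A^{(k)} = 0$. From the continuity conditions (1.19) we immediately get


$$A^{(i+1)} + C^{(i+1)}e^{-\gamma_{i+1}v_i} - A^{(i)} - C^{(i)}e^{-\gamma_i v_i} = 0, \quad (i = 1, 2, \ldots, k - 1). \tag{1.23}$$


Using (1.18) and comparing the coefficients of $e^{-\delta u}$, we further obtain after
elementary algebra


$$A^{(i+1)} + C^{(i+1)}\frac{\delta}{\delta - \gamma_{i+1}}\,e^{-\gamma_{i+1}v_i} - A^{(i)} - C^{(i)}\frac{\delta}{\delta - \gamma_i}\,e^{-\gamma_i v_i} = 0, \quad (i = 1, 2, \ldots, k - 1), \tag{1.24}$$


together with $A^{(1)} + C^{(1)}\frac{\delta}{\delta - \gamma_1} = 1$. Combining (1.23) and (1.24) leads to


$$C^{(i+1)} = \frac{\gamma_i(\delta - \gamma_{i+1})}{\gamma_{i+1}(\delta - \gamma_i)}\exp\{(\gamma_{i+1} - \gamma_i)v_i\}\,C^{(i)} = \frac{\beta - \delta p_i}{\beta - \delta p_{i+1}}\exp\Big\{\beta\Big(\frac{1}{p_i} - \frac{1}{p_{i+1}}\Big)v_i\Big\}\,C^{(i)}$$


so that


$$C^{(i)} = \frac{\gamma_1(\delta - \gamma_i)}{\gamma_i(\delta - \gamma_1)}\exp\Big\{\sum_{j=1}^{i-1}(\gamma_{j+1} - \gamma_j)v_j\Big\}\,C^{(1)} = \frac{\beta - \delta p_1}{\beta - \delta p_i}\exp\Big\{\beta\sum_{j=1}^{i-1}\Big(\frac{1}{p_j} - \frac{1}{p_{j+1}}\Big)v_j\Big\}\,C^{(1)}$$


for $i = 1, \ldots, k$. Define


$$L_i = \sum_{j=1}^{i-1}\gamma_1\Big(\frac{1}{\gamma_j} - \frac{1}{\gamma_{j+1}}\Big)\exp\Big\{\sum_{\ell=1}^{j}(v_{\ell-1} - v_\ell)\gamma_\ell\Big\}. \tag{1.25}$$


Then, again from (1.23), we have


$$A^{(i)} = A^{(1)} + C^{(1)}\frac{\delta}{\delta - \gamma_1}\,L_i = A^{(1)} + (1 - A^{(1)})L_i$$


and hence, since $A^{(k)} = 0$,


$$A^{(1)} = -\frac{L_k}{1 - L_k}, \qquad C^{(1)} = \frac{\delta - \gamma_1}{\delta(1 - L_k)}.$$


Altogether we thus arrive at the explicit formula (1.22) with


$$\psi^{[i]}(u) = \frac{1}{1 - L_k}\Big(L_i - L_k + \frac{\gamma_1(\delta - \gamma_i)}{\delta\gamma_i}\exp\Big\{-\gamma_i u + \sum_{j=1}^{i-1}(\gamma_{j+1} - \gamma_j)v_j\Big\}\Big)$$
$$= \frac{1}{1 - L_k}\Big(L_i - L_k + \frac{\gamma_1}{\gamma_i}\exp\Big\{\sum_{j=1}^{i-1}(\gamma_{j+1} - \gamma_j)v_j\Big\}\,\psi^{(i)}(u)\Big)$$


for $i = 1, \ldots, k$, where $\psi^{(i)}(u)$ again denotes the ruin probability in the classical
model with constant premium intensity $p_i(u) = p_i$. Note that for $k = 2$, the
formula from Example 1.13 is retained.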
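

The closed form built from (1.22) and (1.25) is easy to evaluate. The sketch below assumes exponential claims with rate $\delta$ and $\gamma_i = \delta - \beta/p_i > 0$ for every layer; the function name and the list-based interface are illustrative choices.

```python
import math

def multi_step_ruin_prob(u, levels, premiums, beta, delta):
    """psi(u) for the multi-step rule (1.17) with exponential(delta) claims.

    levels   = [v_1, ..., v_{k-1}] (increasing), premiums = [p_1, ..., p_k].
    """
    k = len(premiums)
    v = [0.0] + list(levels)                       # v[j] = v_j, with v_0 = 0
    gam = [delta - beta / p for p in premiums]     # gam[i - 1] = gamma_i

    # L[i] = L_i from (1.25); L_1 = 0
    L = [0.0] * (k + 1)
    expo = 0.0
    for j in range(1, k):
        expo += (v[j - 1] - v[j]) * gam[j - 1]     # sum_{l<=j} (v_{l-1} - v_l) gamma_l
        L[j + 1] = L[j] + gam[0] * (1.0 / gam[j - 1] - 1.0 / gam[j]) * math.exp(expo)

    i = 1 + sum(1 for lev in levels if u >= lev)   # layer with v_{i-1} <= u < v_i
    shift = sum((gam[j] - gam[j - 1]) * v[j] for j in range(1, i))
    coef = gam[0] * (delta - gam[i - 1]) / (delta * gam[i - 1])
    return (L[i] - L[k] + coef * math.exp(-gam[i - 1] * u + shift)) / (1.0 - L[k])
```

With a single layer the routine reduces to the classical exponential formula, and with two layers it can be compared against the threshold model of Section 1a.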


Remark 1.14 The main tool in the above calculation was the reformulation 
of the integro-differential equations as ordinary differential equations, which
allowed us to find the fundamental solution for each layer locally and separately and
subsequently to determine the coefficients through the continuity assumptions 
between the solutions in different layers (through a system of linear equations). 
This program can still be carried out for, say, Erlang(n) claim sizes, in which 
case the ODEs (with constant coefficients) that generalize (1.21) are of order 
n +1. However, the solution of the resulting linear system of equations usually 
is highly involved and can only be evaluated numerically. 


Notes and references Some early references drawing attention to the model are 
Dawidson [277] and Segerdahl [790]. For the absolute ruin problem, see Gerber [396], 
Dassios & Embrechts [273] and, for a recent extension to finite-time horizons in a more 
general Lévy set-up, Loeffen & Patie [605]. 

Equation (1.6) was derived by Harrison & Resnick [450] by a different approach, 
whereas (1.5) is from Asmussen & Schock Petersen [104]; see further the notes to II.3. 

The analytic derivation of (1.14) can be found in Tichy [851]. For some explicit 
solutions beyond Corollary 1.9, see the notes to Section 2. 

Remark 1.10 is based upon Schock Petersen [695]; for complexity- and accuracy 
aspects, see the Notes to IX.7. Extensive discussion of the numerical solution of 
Volterra equations can be found in Baker [125]; see also Jagerman [499, 500]. An 
extension of Proposition 1.12, in which the switch from $R^{(2)}$ to $R^{(1)}$ only takes place
if the risk process has gone below a threshold $w < v$ first, but the switch from $R^{(1)}$ to
$R^{(2)}$ still takes place at the upcrossing of $v$ is given in Bratiychuk & Derfla [196], see
also Frostig [377].

The special case p2 = 0 in the premium rule (1.15) refers to the situation when all 
original premium income is paid out as dividends to shareholders whenever the surplus 
level is above v. If in addition it is specified that for initial capital u > v, the difference 
u — v is immediately paid out as a lump sum dividend payment, then the resulting 
strategy is known as the horizontal dividend barrier strategy. The ruin probability is 
$\psi(u) = 1$ for the corresponding risk process (namely $R_t$ reflected at $v$) and hence this
case is not of interest for the main focus of this book. However, already back in 1957 
de Finetti [283] suggested to consider the expected discounted accumulated dividend 
payments until ruin in a portfolio as an (economically motivated) alternative to the 
ruin probability for measuring the value of a portfolio, and the identification of the 
optimal dividend strategy to maximize this quantity leads to challenging stochastic
control problems. The horizontal barrier strategy often turns out to be optimal (see 
Gerber [394] for early results; the weakest currently known criteria on the risk process 
under which horizontal dividend barrier strategies are optimal among all admissible 
payout strategies are due to Loeffen [604], Kyprianou, Rivero & Song [568] and Loeffen 
& Renaud [606]). For duality considerations of the reflected risk processes with G/M/1 
queues, see Löpker & Perry [609].

An analysis of exit problems for the threshold model (1.15) in a general Lévy set- 
up (with particular emphasis on the spectrally negative case) is given in Kyprianou & 
Loeffen [565]. 

The threshold and multi-step premium rules are a somewhat popular alternative 
to horizontal dividend strategies that are still to some extent analytically tractable 
and lead to ruin probabilities smaller than 1. The multi-step rule was first studied by 
Kerekesha [530], who looked at (1.18) for arbitrary u > 0 using correction terms for the 
different premium intensities outside the respective layer, which he expressed through 
truncated Fourier transforms. The explicit solution for exponential claims given above 
is due to Albrecher & Hartinger [25]; see also Zhou [919] and Lin & Sendova [594], 
where (1.18) is derived by a renewal approach that directly implies the continuity 
conditions (1.19). In [25, 594, 919] the analysis is also considerably extended to cover 
quantities like the time value of ruin, deficit at ruin and the surplus prior to ruin. To 
improve upon the problem described in Remark 1.14, an alternative recursive approach 
that iteratively calculates the full solution for the same model with one layer less is 
developed in [25]. For extensions of these results to the renewal model see Yang & 
Zhang [904]. 

In the literature also other surplus-dependent risk processes have been discussed 
in connection with dividend payout schemes that lead to a positive probability of 
survival. Among them are time-dependent barrier strategies, for which the barrier 
itself is an increasing function of time and if the risk process touches the barrier it 
stays at the barrier until the next claim occurs and the additional premium income 
is paid out as dividends. The ruin probability for the resulting risk process can then 
often be obtained as the solution of partial integro-differential equations, see Gerber 
[399], Siegl & Tichy [804, 805] and Albrecher & Tichy [41]. Albrecher, Hartinger 
& Thonhauser [26] analytically compare the performance of linear barrier strategies 
with the threshold strategies of (1.15). Albrecher & Kainhofer [30] and Albrecher, 
Kainhofer & Tichy [31] investigate risk processes with a non-linear time-dependent 
barrier structure including constant interest on the surplus. See also Garrido [390] in 
a diffusion setting. For general surveys on dividend models in risk theory, see Avanzi 
[109] and Albrecher & Thonhauser [40]. 
2 The model with constant interest


In this section, we assume that $p(x) = p + ix$. This example is of particular
application relevance because of the interpretation of $i$ as interest rate. However,
it also turns out to have nice mathematical features.

A basic tool is a representation of the ruin probability in terms of a discounted
stochastic integral


$$Z = -\int_0^\infty e^{-it}\,dS_t \tag{2.1}$$


w.r.t. the claim surplus process $S_t = A_t - pt = \sum_{i=1}^{N_t} U_i - pt$ of the associated
compound Poisson model without interest. Write $R_t^{(u)}$ when $R_0 = u$. We first
note that:


Proposition 2.1 $R_t^{(u)} = e^{it}u + R_t^{(0)}$.


Proof. The result is obvious if one thinks in economic terms and represents the
reserve at time $t$ as the initial reserve $u$ with added interest plus the gains/deficit
from the claims and incoming premiums. For a more formal mathematical proof,
note that


$$dR_t^{(u)} = \big(p + iR_t^{(u)}\big)\,dt - dA_t,$$
$$d\big[R_t^{(u)} - e^{it}u\big] = \big(p + i\big[R_t^{(u)} - e^{it}u\big]\big)\,dt - dA_t.$$


Since $R_0^{(u)} - e^{i\cdot 0}u = 0$ for all $u$, $R_t^{(u)} - e^{it}u$ must therefore be independent of $u$
which yields the result.


Let


$$Z_t = e^{-it}R_t^{(0)} = e^{-it}\Big(\int_0^t \big(p + iR_s^{(0)}\big)\,ds - A_t\Big).$$


Then


$$dZ_t = e^{-it}\Big(-i\,dt\cdot\int_0^t \big(p + iR_s^{(0)}\big)\,ds + \big(p + iR_t^{(0)}\big)\,dt + i\,dt\cdot A_t - dA_t\Big)$$
$$= e^{-it}(p\,dt - dA_t) = -e^{-it}\,dS_t.$$


Thus


$$Z_v = -\int_0^v e^{-it}\,dS_t,$$


where the last integral exists pathwise because $\{S_t\}$ is of locally bounded variation.
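Proposition 2.2 The r.v. $Z$ in (2.1) is well-defined and finite, with distribution
$H(z) = P(Z \le z)$ given by the m.g.f.


$$\hat H[\alpha] = Ee^{\alpha Z} = \exp\Big\{\int_0^\infty \kappa(-\alpha e^{-it})\,dt\Big\} = \exp\Big\{\frac{1}{i}\int_0^\alpha \frac{1}{y}\,\kappa(-y)\,dy\Big\},$$


where $\kappa(\alpha) = \beta(\hat B[\alpha] - 1) - p\alpha$. Further $Z_t \to Z$ as $t \to \infty$.


Proof. Let $M_t = A_t - t\beta\mu_B$. Then $S_t = M_t + t(\beta\mu_B - p)$ and $\{M_t\}$ is a
martingale. From this it follows immediately that $\{\int_0^v e^{-it}\,dM_t\}$ is again a
martingale. The mean is 0 and (since $\mathrm{Var}(dM_t) = \beta\mu_B^{(2)}dt$)


$$\mathrm{Var}\Big(\int_0^v e^{-it}\,dM_t\Big) = \int_0^v e^{-2it}\beta\mu_B^{(2)}\,dt = \frac{\beta\mu_B^{(2)}}{2i}\big(1 - e^{-2iv}\big).$$


Hence the limit as $v \to \infty$ exists by the convergence theorem for $L^2$-bounded
martingales, and we have


$$\lim_{v\to\infty} Z_v = -\lim_{v\to\infty}\int_0^v e^{-it}\,dS_t = -\lim_{v\to\infty}\int_0^v e^{-it}\big(dM_t + (\beta\mu_B - p)\,dt\big)$$
$$= -\int_0^\infty e^{-it}\big(dM_t + (\beta\mu_B - p)\,dt\big) = -\int_0^\infty e^{-it}\,dS_t = Z.$$


Now if $X_1, X_2, \ldots$ are i.i.d. with c.g.f. $\varphi$ and $\rho < 1$, we obtain the c.g.f. of
$\sum_{n=1}^\infty \rho^n X_n$ at $\alpha$ as


$$\log\prod_{n=1}^\infty Ee^{\alpha\rho^n X_n} = \log\prod_{n=1}^\infty e^{\varphi(\alpha\rho^n)} = \sum_{n=1}^\infty \varphi(\alpha\rho^n).$$


Letting $\rho = e^{-ih}$, $X_n = S_{nh} - S_{(n+1)h}$, we have $\varphi(\alpha) = h\kappa(-\alpha)$, and obtain the
c.g.f. of $Z = -\int_0^\infty e^{-it}\,dS_t$ as


$$\lim_{h\downarrow 0}\sum_{n=1}^\infty \varphi(\alpha\rho^n) = \lim_{h\downarrow 0} h\sum_{n=1}^\infty \kappa(-\alpha e^{-inh}) = \int_0^\infty \kappa(-\alpha e^{-it})\,dt;$$


the last expression for $\hat H[\alpha]$ follows by the substitution $y = \alpha e^{-it}$.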


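Theorem 2.3 $\psi(u) = \dfrac{H(-u)}{E\big[H(-R_{\tau(u)}) \mid \tau(u) < \infty\big]}$.


Proof. Write $\tau = \tau(u)$ for brevity. On $\{\tau < \infty\}$, we have


$$u + Z = (u + Z_\tau) + (Z - Z_\tau) = e^{-i\tau}\Big[e^{i\tau}(u + Z_\tau) - \int_\tau^\infty e^{-i(t - \tau)}\,dS_t\Big] = e^{-i\tau}\big[R_\tau^{(u)} + Z^*\big],$$


where $Z^* = -\int_\tau^\infty e^{-i(t - \tau)}\,dS_t$ is independent of $\mathcal F_\tau$ and distributed as $Z$. The
last equality followed from $R_t^{(u)} = e^{it}(Z_t + u)$, cf. Proposition 2.1, which also
yields $\tau < \infty$ on $\{Z < -u\}$. Hence


$$H(-u) = P(u + Z \le 0) = P(R_\tau + Z^* \le 0;\ \tau < \infty)$$
$$= \psi(u)\,E\big[P(R_\tau + Z^* \le 0 \mid \mathcal F_\tau) \mid \tau < \infty\big]$$
$$= \psi(u)\,E\big[H(-R_{\tau(u)}) \mid \tau(u) < \infty\big].$$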


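Corollary 2.4 Assume that $B$ is exponential, $\bar B(x) = e^{-\delta x}$, and that $p(x) =
p + ix$ with $p > 0$. Then


$$\psi(u) = \frac{\beta i^{\beta/i - 1}\,\Gamma\big(\frac{\delta(p + iu)}{i};\frac{\beta}{i}\big)}{\delta^{\beta/i}p^{\beta/i}e^{-\delta p/i} + \beta i^{\beta/i - 1}\,\Gamma\big(\frac{\delta p}{i};\frac{\beta}{i}\big)}, \tag{2.2}$$


where $\Gamma(x; \eta) = \int_x^\infty t^{\eta - 1}e^{-t}\,dt$ is the incomplete Gamma function.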


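Proof 1. We use Corollary 1.9 and get


$$\omega(x) = \int_0^x \frac{1}{p + it}\,dt = \frac{1}{i}\log(p + ix) - \frac{1}{i}\log p,$$
$$g(x) = \frac{\gamma_0\beta}{p + ix}\exp\Big\{\frac{\beta}{i}\log(p + ix) - \frac{\beta}{i}\log p - \delta x\Big\} = \frac{\gamma_0\beta}{p^{\beta/i}}\,(p + ix)^{\beta/i - 1}e^{-\delta x},$$
$$\frac{1}{\gamma_0} = 1 + \int_0^\infty \frac{\beta}{p(x)}\exp\{\beta\omega(x) - \delta x\}\,dx = 1 + \int_0^\infty \frac{\beta}{p^{\beta/i}}\,(p + ix)^{\beta/i - 1}e^{-\delta x}\,dx$$
$$= 1 + \frac{\beta}{ip^{\beta/i}}\int_p^\infty y^{\beta/i - 1}e^{-\delta(y - p)/i}\,dy = 1 + \frac{\beta e^{\delta p/i}\,i^{\beta/i - 1}}{\delta^{\beta/i}p^{\beta/i}}\,\Gamma\Big(\frac{\delta p}{i};\frac{\beta}{i}\Big),$$
$$\psi(u) = \gamma_0\int_u^\infty \frac{\beta}{p(x)}\exp\{\beta\omega(x) - \delta x\}\,dx = \gamma_0\,\frac{\beta i^{\beta/i - 1}e^{\delta p/i}}{\delta^{\beta/i}p^{\beta/i}}\,\Gamma\Big(\frac{\delta(p + iu)}{i};\frac{\beta}{i}\Big),$$


from which (2.2) follows by elementary algebra.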
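Proof 2. We use Theorem 2.3. From $\kappa(\alpha) = \beta\alpha/(\delta - \alpha) - p\alpha$, it follows that


$$\log\hat H[\alpha] = \frac{1}{i}\int_0^\alpha \frac{1}{y}\,\kappa(-y)\,dy = \frac{1}{i}\int_0^\alpha \Big(p - \frac{\beta}{\delta + y}\Big)\,dy$$
$$= \frac{1}{i}\big[p\alpha + \beta\log\delta - \beta\log(\delta + \alpha)\big] = \log\Big[e^{\alpha p/i}\Big(\frac{\delta}{\delta + \alpha}\Big)^{\beta/i}\Big],$$


which shows that $Z$ is distributed as $p/i - V$, where $V$ is Gamma$(\delta, \beta/i)$, i.e.
with density


$$f_V(x) = \frac{\delta^{\beta/i}x^{\beta/i - 1}}{\Gamma(\beta/i)}\,e^{-\delta x}, \quad x > 0.$$


In particular,


$$H(-u) = P(Z \le -u) = P(V \ge u + p/i) = \frac{\Gamma\big(\delta(p + iu)/i;\ \beta/i\big)}{\Gamma(\beta/i)}.$$


By the memoryless property of the exponential distribution, $-R_{\tau(u)}$ has an
exponential distribution with rate $\delta$ and hence


$$E\big[H(-R_{\tau(u)}) \mid \tau(u) < \infty\big] = \int_0^\infty \delta e^{-\delta x}\,P(p/i - V \le x)\,dx$$
$$= \big[-e^{-\delta x}P(p/i - V \le x)\big]_0^\infty + \int_0^{p/i} e^{-\delta x}f_V(p/i - x)\,dx$$
$$= P(V \ge p/i) + \int_0^{p/i} e^{-\delta x}\,\frac{(p/i - x)^{\beta/i - 1}\delta^{\beta/i}}{\Gamma(\beta/i)}\,e^{-\delta(p/i - x)}\,dx$$
$$= \frac{\Gamma(\delta p/i;\ \beta/i)}{\Gamma(\beta/i)} + \frac{(p/i)^{\beta/i}\delta^{\beta/i}e^{-\delta p/i}}{\Gamma(\beta/i)\,\beta/i}.$$


From this (2.2) follows by elementary algebra.


Proof 3. Just insert $p(x) = p + ix$ into (1.14) and identify the resulting integral
as the incomplete Gamma function.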
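

Formula (2.2) can be evaluated with a few lines of code. In the sketch below the incomplete Gamma function is approximated by a plain trapezoidal rule on a truncated range (the helper `upper_inc_gamma`, its `span` and `n` parameters are illustrative assumptions, and a library routine would normally be preferable); it requires a strictly positive lower limit, which holds here since $\delta p/i > 0$.

```python
import math

def upper_inc_gamma(x, eta, span=60.0, n=60_000):
    """Gamma(x; eta) = int_x^inf t^(eta-1) e^(-t) dt for x > 0, truncated at x + span."""
    h = span / n
    f = lambda s: (x + s) ** (eta - 1.0) * math.exp(-(x + s))
    return h * (0.5 * f(0.0) + sum(f(k * h) for k in range(1, n)) + 0.5 * f(span))

def ruin_prob_interest(u, p, i, beta, delta):
    """psi(u) from (2.2): exponential(delta) claims, premium rule p(x) = p + i*x."""
    eta = beta / i
    a = beta * i ** (eta - 1.0)
    num = a * upper_inc_gamma(delta * (p + i * u) / i, eta)
    den = (delta * p) ** eta * math.exp(-delta * p / i) \
        + a * upper_inc_gamma(delta * p / i, eta)
    return num / den
```

Two special cases are easy to verify by hand: for $\beta = i$ one has $\Gamma(x; 1) = e^{-x}$ and (2.2) reduces to $\psi(u) = \beta e^{-\delta u}/(\delta p + \beta)$, and for $\beta = 2i$ one can use $\Gamma(x; 2) = (x + 1)e^{-x}$.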
Example 2.5 The analysis leading to Theorem 2.3 is also valid if $\{R_t\}$ is obtained
by adding interest to a more general process with stationary independent
increments. As an example, assume Brownian motion $\{B_t\}$ with drift
$\mu$ and variance $\sigma^2$; then $\{R_t\}$ is the diffusion with drift function $\mu + ix$ and
constant variance $\sigma^2$. The process $\{S_t\}$ corresponds to $\{-B_t\}$ so that $\kappa(\alpha) =
\sigma^2\alpha^2/2 - \mu\alpha$, and the c.g.f. of $Z$ is


$$\log\hat H[\alpha] = \frac{1}{i}\int_0^\alpha \frac{1}{y}\,\kappa(-y)\,dy = \frac{1}{i}\int_0^\alpha \Big(\frac{\sigma^2 y}{2} + \mu\Big)\,dy = \frac{\sigma^2\alpha^2}{4i} + \frac{\mu\alpha}{i}.$$


I.e., $Z$ is normal$(\mu/i, \sigma^2/2i)$, and since $R_\tau = 0$ by the continuity of Brownian
motion, it follows that the ruin probability is


$$\psi(u) = \frac{H(-u)}{H(0)} = \frac{\Phi\Big(\dfrac{-u - \mu/i}{\sigma/\sqrt{2i}}\Big)}{\Phi\Big(\dfrac{-\mu/i}{\sigma/\sqrt{2i}}\Big)}. \tag{2.3}$$

Notes and references Theorem 2.3 is from Harrison [449]; for a martingale proof, 
see e.g. Gerber [398, p. 134] (the time scale there is discrete but the argument is easily 
adapted to the continuous case). Corollary 2.4 is classical. Formula (2.3) was derived 
by Emanuel et al. [342] and Harrison [449]; it is also used as a basis for a diffusion 
approximation by these authors. 

Paulsen & Gjessing [687] found some remarkable explicit formulas for $\psi(u)$ beyond
the exponential case in Corollary 1.9. The solution is in terms of Bessel functions
for an Erlang(2) $B$ and in terms of confluent hypergeometric functions for an $H_2$ $B$ (a
mixture of two exponentials). It must be noted, however, that the analysis does not
seem to carry over to general phase-type distributions, not even Erlang(3) or $H_3$, nor
to non-linear premium rules $p(\cdot)$.

Explicit formulas for the finite-time ruin probabilities $\psi(u,T)$ for exponential
claims in terms of finite gamma series were obtained in Albrecher, Teugels & Tichy
[39] whenever $\beta = ki$ for some integer $k$. See also Knessl & Peters [548] for a detailed
asymptotic study of $\psi(u,T)$ for $i > 0$ and exponential claims. Avram, Leonenko &
Rabehasaina [110] extend the method of [39] to certain jump-diffusion models. A numerical
algorithm for determining $\psi(u,T)$ based on discrete time Markov chains can
be found in Cardoso & Waters [221, 222].

A r.v. of the form $\sum_{n=1}^\infty \rho^n X_n$ with the $X_n$ i.i.d. as in the proof of Proposition 2.2
is a special case of a perpetuity; see e.g. Goldie & Grübel [422] and Section 5.

Further studies of the model with interest can be found in Boogaert & Crijns
[180], Gerber [396], Delbaen & Haezendonck [288], Emanuel et al. [342], Paulsen
[680, 681, 682], Paulsen & Gjessing [687], Sundt & Teugels [822, 823], Yang [899], Cai
& Dickson [215] and Rullière & Loisel [758]. Some of these references also go into a
stochastic interest rate.


3 The local adjustment coefficient. Logarithmic 
asymptotics 


For the classical risk model with constant premium rule $p(x) \equiv p^*$, write $\gamma^*$ for
the solution of the Lundberg equation


$$\beta(\hat B[\gamma^*] - 1) - \gamma^*p^* = 0, \tag{3.1}$$


write $\psi^*(u)$ for the ruin probability etc., and recall Lundberg's inequality


$$\psi^*(u) \le e^{-\gamma^*u} \tag{3.2}$$


and the Cramér-Lundberg approximation


$$\psi^*(u) \sim C^*e^{-\gamma^*u}. \tag{3.3}$$


When trying to extend these results to the model of this chapter where $p(x)$
depends on $x$, a first step is the following:


Theorem 3.1 Assume that for some $0 < \delta_0 \le \infty$, it holds that $\hat B[s] \uparrow \infty$,
$s \uparrow \delta_0$, and that $p(x) \to \infty$, $x \to \infty$. Then $\limsup_{u\to\infty} \frac{\log\psi(u)}{u} \le -\delta_0$. If $\delta_0 < \infty$
and $e^{-\epsilon x}p(x) \to 0$, $e^{(\delta_0 + \epsilon)x}\bar B(x) \to \infty$ for all $\epsilon > 0$, then $\frac{\log\psi(u)}{u} \to -\delta_0$,
$u \to \infty$.


In the proof as well as in the remaining part of the section, we will use the local
adjustment coefficient $\gamma(x)$, which for a fixed $x$ is defined as the adjustment
coefficient of the classical risk model with $p^* = p(x)$, i.e. as the solution of the
equation


$$\kappa(x, \gamma(x)) = 0 \quad\text{where}\quad \kappa(x, \alpha) = \beta(\hat B[\alpha] - 1) - \alpha p(x). \tag{3.4}$$


We assume existence of $\gamma(x)$ for all $x$, as will hold under the steepness assumption
of Theorem 3.1, and (for simplicity) that


$$\inf_{x \ge 0} p(x) > \beta\mu_B, \tag{3.5}$$
which implies $\inf_{x \ge 0}\gamma(x) > 0$. The intuitive idea behind introducing local
adjustment coefficients is that the classical risk model with premium rate $p^* =
p(x)$ serves as a 'local approximation' at level $x$ for the general model when the
reserve is close to $x$.³


Proof of Theorem 3.1. The steepness assumption and $p(x) \to \infty$ ensure $\gamma(x) \to
\delta_0$. Let $\gamma^* < \delta_0$, let $p^*$ be as in (3.1) and for a given $\epsilon > 0$, choose $u_0$ such
that $p(x) \ge p^*$ when $x \ge u_0\epsilon$. When $u \ge u_0$, obviously $\psi(u)$ can be bounded
with the probability that the Cramér-Lundberg compound Poisson model with
premium rate $p^*$ downcrosses level $u\epsilon$ starting from $u$, which in turn by Lundberg's
inequality can be bounded by $e^{-\gamma^*(1 - \epsilon)u}$. Hence $\limsup_{u\to\infty} \log\psi(u)/u
\le -\gamma^*(1 - \epsilon)$. Letting first $\epsilon \to 0$ and next $\gamma^* \uparrow \delta_0$ yields the first statement of
the theorem.

For the last assertion, choose $c_\epsilon^{(1)}, c_\epsilon^{(2)}$ such that $p(x) \le c_\epsilon^{(1)}e^{\epsilon x}$, $\bar B(x) \ge
c_\epsilon^{(2)}e^{-(\delta_0 + \epsilon)x}$ for all $x$. Then we have the following lower bound for the time for
the reserve to go from level $u$ to level $u + v$ without a claim:


$$\omega(u + v) - \omega(u) = \int_u^{u + v}\frac{1}{p(t)}\,dt \ge c_\epsilon^{(3)}e^{-\epsilon u},$$


where $c_\epsilon^{(3)} = (1 - e^{-\epsilon v})/(\epsilon c_\epsilon^{(1)})$. Therefore the probability that a claim arrives
before the reserve has reached level $u + v$ is at least $c_\epsilon^{(4)}e^{-\epsilon u}$. Given such an
arrival, ruin will occur if the claim is at least $u + v$, and hence


$$\psi(u) \ge c_\epsilon^{(4)}e^{-\epsilon u}\,c_\epsilon^{(2)}e^{-(\delta_0 + \epsilon)(u + v)}.$$


The truth of this for all $\epsilon > 0$ implies $\liminf \log\psi(u)/u \ge -\delta_0$.


Obviously, Theorem 3.1 only presents a first step, and in particular, the
result is not very informative if $\delta_0 = \infty$. The rest of this section deals with
tail estimates involving the local adjustment coefficient. The first main result
in this direction is the following version of Lundberg's inequality:


Theorem 3.2 Assume that $p(x)$ is a non-decreasing function of $x$ and let $I(u) =
\int_0^u \gamma(x)\,dx$. Then


$$\psi(u) \le e^{-I(u)}. \tag{3.6}$$


The second main result to be derived states that the bound in Theorem 3.2
is also an approximation under appropriate conditions. The form of the result
is superficially similar to the Cramér-Lundberg approximation, noting that in
many cases the constant $C$ is close to 1. However, the limit is not $u \to \infty$ but
the slow Markov walk limit in large deviations theory (see e.g. Bucklew [207]).
For $\epsilon > 0$, let $\psi_\epsilon(u)$ be evaluated for the process $\{R_t^{(\epsilon)}\}$ defined as in (1.2), only
with $\beta$ replaced by $\beta/\epsilon$ and $U_i$ by $\epsilon U_i$.


³ Note that this was also the motivation behind the approach of Section 1b.


Theorem 3.3 Assume that either (a) $p(r)$ is a non-decreasing function of $r$,
or (b) Condition 3.13 below holds. Then


$$\lim_{\epsilon \downarrow 0} -\epsilon\log\psi_\epsilon(u) = I(u). \tag{3.7}$$


Remarks:


1. Condition 3.13 is a technical condition on the claim size distribution $B$,
which essentially says that an overshoot r.v. $U \mid U > x$ cannot have a much
heavier tail than the claim $U$ itself.


2. If $p(x) \equiv p$ is constant, then $R_t^{(\epsilon)} = \epsilon R_{t/\epsilon}$ for all $t$ so that $\psi_\epsilon(u) = \psi(u/\epsilon)$,
i.e., the asymptotics $u \to \infty$ and $\epsilon \to 0$ are the same.


3. The slow Markov walk limit is appropriate if $p(x)$ does not vary too much
compared to the given mean interarrival time $1/\beta$ and the size $U$ of the
claims; one can then assume that $\epsilon = 1$ is small enough for Theorem 3.3
to be reasonably precise and use $e^{-I(u)}$ as approximation to $\psi(u)$.


4. As typical in large deviations theory, the logarithmic form of (3.7) only
captures 'the main term in the exponent', but is not precise to describe
the asymptotic form of $\psi(u)$ in terms of ratio limit theorems (the precise
asymptotics could be $\log I(u)\,e^{-I(u)}$ or $I(u)^\alpha e^{-I(u)}$, say, rather than
$e^{-I(u)}$).


3a Examples 


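Before giving the proofs of Theorems 3.2, 3.3, we consider some simple examples.
First, we show how to rewrite the explicit solution for $\psi(u)$ in Corollary 1.9 in
terms of $I(u)$ when the claims are exponential:


Example 3.4 Consider again the exponential case $\bar B(x) = e^{-\delta x}$ as in Corollary
1.9. Then $\gamma(x) = \delta - \beta/p(x)$, and


$$\int_0^u \gamma(x)\,dx = \delta u - \beta\int_0^u p(x)^{-1}\,dx = \delta u - \beta\omega(u).$$


Integrating by parts, we get


$$\frac{1}{\gamma_0} = 1 + \int_0^\infty \frac{\beta}{p(x)}\exp\{\beta\omega(x) - \delta x\}\,dx = 1 + \int_0^\infty \beta\omega'(x)\exp\{\beta\omega(x) - \delta x\}\,dx$$
$$= 1 + \big[\exp\{\beta\omega(x) - \delta x\}\big]_0^\infty + \delta\int_0^\infty \exp\{\beta\omega(x) - \delta x\}\,dx$$
$$= 1 + 0 - 1 + \delta\int_0^\infty e^{-I(x)}\,dx,$$


$$\frac{1}{\gamma_0}\int_u^\infty g(x)\,dx = \int_u^\infty \frac{\beta}{p(x)}\exp\{\beta\omega(x) - \delta x\}\,dx$$
$$= \big[\exp\{\beta\omega(x) - \delta x\}\big]_u^\infty + \delta\int_u^\infty \exp\{\beta\omega(x) - \delta x\}\,dx$$
$$= \delta\int_u^\infty \exp\{\beta\omega(x) - \delta x\}\,dx - \exp\{\beta\omega(u) - \delta u\},$$


and hence


$$\psi(u) = \frac{\int_u^\infty e^{-I(y)}\,dy - e^{-I(u)}/\delta}{\int_0^\infty e^{-I(y)}\,dy} = e^{-I(u)}\,\frac{\int_0^\infty e^{-\int_0^y \gamma(x + u)dx}\,dy - 1/\delta}{\int_0^\infty e^{-\int_0^y \gamma(x)dx}\,dy}. \tag{3.8}$$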
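

To see how close the bound $e^{-I(u)}$ is to the exact value in the exponential case, one can evaluate the first form of (3.8) numerically. The sketch below accumulates $I(y)$ with the trapezoidal rule; the function name, the truncation point `y_max` and the grid size `n` are illustrative assumptions, and `u` is rounded to the nearest grid point.

```python
import math

def ruin_prob_via_I(u, p, beta, delta, y_max=80.0, n=80_000):
    """Return (psi(u), exp(-I(u))) from (3.8) for exponential(delta) claims."""
    h = y_max / n
    gamma = lambda x: delta - beta / p(x)      # local adjustment coefficient
    k_u = round(u / h)
    I = 0.0                                    # running value of I(y)
    I_u = 0.0
    prev_gamma, prev_e = gamma(0.0), 1.0       # e^{-I(0)} = 1
    num = den = 0.0
    for k in range(1, n + 1):
        g = gamma(k * h)
        I += 0.5 * h * (prev_gamma + g)
        e = math.exp(-I)
        piece = 0.5 * h * (prev_e + e)
        den += piece                           # int_0^y_max e^{-I(y)} dy
        if k > k_u:
            num += piece                       # int_u^y_max e^{-I(y)} dy
        if k == k_u:
            I_u = I
        prev_gamma, prev_e = g, e
    bound = math.exp(-I_u)                     # right-hand side of (3.6)
    return (num - bound / delta) / den, bound
```

For a non-decreasing premium rule the first returned value should never exceed the second, in line with Theorem 3.2.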


We next give direct derivations of Theorems 3.2, 3.3 in the particularly simple 
case of diffusions: 


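Example 3.5 Assume that $\{R_t\}$ is a diffusion on $[0,\infty)$ with drift $\mu(x)$ and
variance $\sigma^2(x) > 0$ at $x$. The appropriate definition of the local adjustment
coefficient $\gamma(x)$ is then as the one $2\mu(x)/\sigma^2(x)$ for the locally approximating
Brownian motion. It is well known (see Theorem XIII.4.4 or Karlin & Taylor
[522, pp. 191-195]) that

$$\psi(u) = \frac{\int_u^\infty e^{-I(y)}\,dy}{\int_0^\infty e^{-I(y)}\,dy}
= e^{-I(u)}\, \frac{\int_0^\infty e^{-\int_0^y \gamma(x+u)\,dx}\,dy}{\int_0^\infty e^{-\int_0^y \gamma(x)\,dx}\,dy}. \tag{3.9}$$

If $\gamma(x)$ is increasing, applying the inequality $\gamma(x+u) \ge \gamma(x)$ yields immediately
the conclusion of Theorem 3.2. For Theorem 3.3, note first that the appropriate
slow Markov walk assumption amounts to $\mu_\varepsilon(x) = \mu(x)$, $\sigma_\varepsilon^2(x) = \varepsilon\sigma^2(x)$ so that
$\gamma_\varepsilon(x) = \gamma(x)/\varepsilon$, $I_\varepsilon(u) = I(u)/\varepsilon$, and (3.9) yields

$$-\varepsilon \log \psi_\varepsilon(u) = I(u) + A_\varepsilon - B_\varepsilon \tag{3.10}$$

where

$$A_\varepsilon = \varepsilon \log\Big(\int_0^\infty e^{-\int_0^y \gamma(x)\,dx/\varepsilon}\,dy\Big), \qquad
B_\varepsilon = \varepsilon \log\Big(\int_0^\infty e^{-\int_0^y \gamma(x+u)\,dx/\varepsilon}\,dy\Big).$$

The analogue of (3.5) is $\inf_{x \ge 0} \gamma(x) > 0$ which implies that the integral in the
definition of $A_\varepsilon$ converges to 0. In particular, the integral is bounded by 1
eventually and hence $\limsup A_\varepsilon \le \limsup \varepsilon \log 1 = 0$. Choosing $y_0, \gamma_0 > 0$ such
that $\gamma(x) \le \gamma_0$ for $x \le y_0$, we get

$$\int_0^\infty e^{-\int_0^y \gamma(x)\,dx/\varepsilon}\,dy \ge \int_0^{y_0} e^{-y\gamma_0/\varepsilon}\,dy
= \frac{\varepsilon}{\gamma_0}\big(1 - e^{-y_0\gamma_0/\varepsilon}\big) \sim \frac{\varepsilon}{\gamma_0}.$$

This implies $\liminf A_\varepsilon \ge \lim \varepsilon \log \varepsilon = 0$ and $A_\varepsilon \to 0$. Similarly, $B_\varepsilon \to 0$, and
(3.7) follows.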
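The convergence in (3.10) can be observed numerically. The sketch below assumes the purely illustrative choice $\gamma(x) = 1 + x$ (so that $I(u) = u + u^2/2$), and truncates the integrals defining $A_\varepsilon$ and $B_\varepsilon$ where the integrand is negligible; none of these choices comes from the text.

```python
import math

def gamma(x):
    """Hypothetical local adjustment coefficient 2*mu(x)/sigma^2(x); here 1 + x."""
    return 1.0 + x

def I(u):
    """I(u) = int_0^u gamma(x) dx for gamma(x) = 1 + x."""
    return u + 0.5 * u * u

def tail_integral(u, eps, n=4000):
    """int_0^inf exp(-(I(u+y) - I(u))/eps) dy, truncated at 60*eps/gamma(u), Simpson rule."""
    width = 60.0 * eps / gamma(u)
    h = width / n
    f = lambda y: math.exp(-(I(u + y) - I(u)) / eps)
    s = f(0.0) + f(width)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(k * h)
    return s * h / 3.0

def scaled_log_ruin(u, eps):
    """-eps*log(psi_eps(u)) written as I(u) + A_eps - B_eps, cf. (3.9) and (3.10)."""
    A = eps * math.log(tail_integral(0.0, eps))
    B = eps * math.log(tail_integral(u, eps))
    return I(u) + A - B

for eps in (0.1, 0.01, 0.001):
    print(eps, scaled_log_ruin(1.0, eps))  # approaches I(1) = 1.5 as eps decreases
```

Factoring $e^{-I(u)/\varepsilon}$ out of the numerator, as in the second form of (3.9), avoids floating-point underflow for small $\varepsilon$.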


The analogue of Example 3.5 for risk processes with exponential claims is as 
follows: 


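Example 3.6 Assume that $B$ is exponential with rate $\delta$. Then the solution of
the Lundberg equation is $\gamma^* = \delta - \beta/p^*$ so that

$$I(u) = \delta u - \beta \int_0^u p(x)^{-1}\,dx.$$

Note that this expression shows up also in the explicit formula for $\psi(u)$ in the
form given in Example 3.4. Ignoring $1/\delta$ in the formula there, this leads to (3.6)
exactly as in Example 3.5. Further, the slow Markov walk assumption means
$\delta_\varepsilon = \delta/\varepsilon$, $\beta_\varepsilon = \beta/\varepsilon$. Thus $\gamma_\varepsilon(x) = \gamma(x)/\varepsilon$ and (3.10) holds if we redefine $A_\varepsilon$ as

$$A_\varepsilon = \varepsilon \log\Big(\int_0^\infty e^{-\int_0^y \gamma(x)\,dx/\varepsilon}\,dy - \varepsilon/\delta\Big)$$

and similarly for $B_\varepsilon$. As in Example 3.5,

$$\limsup_{\varepsilon \to 0} A_\varepsilon \le \limsup_{\varepsilon \to 0} \varepsilon \log(1 - 0) = 0.$$

By (3.5) and $\gamma^* = \delta - \beta/p^*$, we have $\delta > \gamma_0$ and get

$$\liminf_{\varepsilon \to 0} A_\varepsilon \ge \lim_{\varepsilon \to 0} \varepsilon \log\Big(\varepsilon\Big(\frac{1}{\gamma_0} - \frac{1}{\delta}\Big)\Big) = 0.$$

Now (3.7) follows just as in Example 3.5.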
We next investigate what the upper bound (or approximation) $e^{-I(u)}$ looks
like in the case $p(x) = p + ix$ (interest) subject to various forms of the tail
$\overline{B}(x)$ of $B$. Of course, $\gamma(x)$ is typically not explicit, so our approach is to
determine standard functions $G_1(u), \ldots, G_q(u)$ representing the first few terms
in the asymptotic expansion of $I(u)$ as $u \to \infty$. I.e.,

$$G_i(u) \to \infty, \qquad \frac{G_{i+1}(u)}{G_i(u)} = o(1), \qquad I(u) = G_1(u) + \cdots + G_q(u) + o(G_q(u)).$$

It should be noted, however, that the interchange of the slow Markov walk limit
$\varepsilon \to 0$ and the limit $u \to \infty$ is not formally justified and in fact, the slow Markov
walk approximation deteriorates as $x$ becomes large. Nevertheless, the results
are suggestive in their form and much more explicit than anything else in the
literature.


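Example 3.7 Assume that

$$\overline{B}(x) \sim c_1 x^{\alpha-1} e^{-\delta x}, \quad x \to \infty \tag{3.11}$$

with $\alpha > 0$. This covers mixtures or convolutions of exponentials or, more
generally, phase-type distributions (Example I.2.4) or gamma distributions; in
the phase-type case, the typical case is $\alpha = 1$ which holds, e.g., if the phase
generator is irreducible (Proposition IX.1.8). It follows from (3.11) that $\hat{B}[s] \to \infty$
as $s \uparrow \delta$ and hence $\gamma^* \uparrow \delta$ as $p^* \to \infty$. More precisely,

$$\hat{B}[s] = 1 + s\int_0^\infty e^{sx}\,\overline{B}(x)\,dx = 1 + \frac{c_1 \delta\, \Gamma(\alpha)}{(\delta - s)^{\alpha}}\,(1 + o(1))$$

as $s \uparrow \delta$, and hence (3.1) leads to

$$(\delta - \gamma^*)^{\alpha} \approx \frac{\beta c_1 \Gamma(\alpha)}{p^*}, \qquad \gamma^* \approx \delta - c_2\, p^{*-1/\alpha}, \qquad c_2 = \big(\beta c_1 \Gamma(\alpha)\big)^{1/\alpha},$$

$$I(u) \approx \delta u - c_2 \int_0^u \frac{1}{(p + ix)^{1/\alpha}}\,dx \approx
\begin{cases}
\delta u & \alpha < 1, \\
\delta u - c_3 \log u & \alpha = 1, \\
\delta u - c_4 u^{1-1/\alpha} & \alpha > 1,
\end{cases}$$

where $c_3 = c_2/i$, $c_4 = c_2 i^{-1/\alpha}/(1 - 1/\alpha)$.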


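Example 3.8 Assume next that $B$ has bounded support, say 1 is the upper
limit and

$$\overline{B}(x) \sim c_5 (1 - x)^{\eta - 1}, \quad x \uparrow 1, \tag{3.12}$$

with $\eta \ge 1$. For example, $\eta = 1$ if $B$ is degenerate at 1, $\eta = 2$ if $B$ is uniform
on $(0,1)$ and $\eta = k + 1$ if $B$ is the convolution of $k$ uniforms on $(0,1/k)$. Here
$\hat{B}[s]$ is defined for all $s$ and

$$\hat{B}[s] - 1 = s\int_0^1 e^{sx}\,\overline{B}(x)\,dx = e^{s}\int_0^s e^{-y}\,\overline{B}(1 - y/s)\,dy
\sim \frac{c_5 e^{s}}{s^{\eta-1}} \int_0^\infty e^{-y} y^{\eta-1}\,dy = \frac{c_5 e^{s}\,\Gamma(\eta)}{s^{\eta-1}}$$

as $s \uparrow \infty$. Hence (3.1) leads to $\beta c_5 e^{\gamma^*} \Gamma(\eta) \approx \gamma^{*\eta} p^*$,

$$\gamma^* \approx \log p^* + \eta \log\log p^*, \qquad I(u) \approx u(\log u + \eta \log\log u).$$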


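Example 3.9 As a case intermediate between (3.11) and (3.12), assume that

$$\overline{B}(x) \sim c_6 e^{-x^2/2c_7}, \quad x \uparrow \infty. \tag{3.13}$$

We get

$$\begin{aligned}
\hat{B}[s] - 1 &\approx c_6 s \int_0^\infty e^{sx} e^{-x^2/2c_7}\,dx = c_6 s\, e^{c_7 s^2/2} \int_0^\infty e^{-(x - c_7 s)^2/2c_7}\,dx \\
&= c_6 s \sqrt{2\pi c_7}\, e^{c_7 s^2/2}\, \Phi(s\sqrt{c_7}) \sim c_6 s \sqrt{2\pi c_7}\, e^{c_7 s^2/2},
\end{aligned}$$

$$\frac{c_7 \gamma^{*2}}{2} \sim \log p^*, \qquad \gamma^* \sim c_8 \sqrt{\log p^*}, \qquad I(u) \approx c_8 u \sqrt{\log u},$$

where $c_8 = \sqrt{2/c_7}$.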
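When $\gamma(x)$ is not explicit, it can be obtained by solving the Lundberg equation $\beta(\hat{B}[\gamma] - 1) - \gamma p = 0$ numerically at each level and then integrating. The sketch below uses bisection and Simpson's rule; the function names, the bracketing interval and the parameter values are assumptions made for illustration. It is checked against the closed form of Example 3.6 and applied to uniform claims on $(0,1)$ (the case $\eta = 2$ of Example 3.8).

```python
import math

def adjustment_coefficient(mgf, beta, p, s_hi):
    """Positive root of kappa(s) = beta*(mgf(s) - 1) - s*p on (0, s_hi), by bisection."""
    kappa = lambda s: beta * (mgf(s) - 1.0) - s * p
    lo, hi = 1e-6, s_hi
    if not (kappa(lo) < 0.0 < kappa(hi)):
        raise ValueError("need kappa < 0 near 0 (net profit condition) and kappa(s_hi) > 0")
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if kappa(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def exp_mgf(delta):
    """m.g.f. of the exponential(delta) distribution, valid for s < delta."""
    return lambda s: delta / (delta - s)

def uniform_mgf(s):
    """m.g.f. of the uniform distribution on (0, 1), for s > 0."""
    return math.expm1(s) / s

def I(u, mgf, beta, p, i, s_hi, n=200):
    """I(u) = int_0^u gamma(x) dx for the premium rule p(x) = p + i*x (Simpson rule)."""
    g = lambda x: adjustment_coefficient(mgf, beta, p + i * x, s_hi)
    h = u / n
    total = g(0.0) + g(u)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(k * h)
    return total * h / 3.0

print(adjustment_coefficient(uniform_mgf, 1.0, 1.0, 50.0))   # local coefficient, uniform claims
print(I(5.0, exp_mgf(1.0), 0.5, 1.0, 0.1, 1.0 - 1e-9))       # compare with Example 3.6
```

Bisection is used here only because $\kappa(s)$ is convex with $\kappa(0) = 0$, so the positive root is bracketed once $\kappa$ is negative near 0 and positive at the upper end of the interval.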


3b Proof of Theorem 3.2 


We first remark that the definition (3.4) of the local adjustment coefficient is
not the only possible one: whereas the motivation for (3.4) is the formula

$$\frac{1}{h} \log \mathbb{E}_u e^{-s(R_h - u)} \sim \beta(\hat{B}[s] - 1) - s\,p(u), \quad h \downarrow 0, \tag{3.14}$$

for the m.g.f. of the increment in a small time interval $[0, h]$, one could also have
considered the increment $r_u(T_1) - u - U_1$ up to the first claim (here $r_u(\cdot)$ denotes
the solution of $\dot{r} = p(r)$ starting from $r_u(0) = u$). This leads to an alternative
local adjustment coefficient $\gamma_0(u)$ defined as solution of

$$1 = \mathbb{E} e^{\gamma_0(u)(U_1 + u - r_u(T_1))} = \hat{B}[\gamma_0(u)] \int_0^\infty \beta e^{-\beta t} e^{\gamma_0(u)(u - r_u(t))}\,dt. \tag{3.15}$$
0 
Proposition 3.10 Assume that $p(x)$ is a non-decreasing function of $x$. Then:
(a) $\gamma(x)$ and $\gamma_0(x)$ are also non-decreasing functions of $x$;
(b) $\gamma(x) \le \gamma_0(x)$.


Proof. That $\gamma(x)$ is non-decreasing follows easily by inspection of (3.4). The
assumption implies that $r_u(t) - u$ is a non-decreasing function of $u$. Hence for
$u < v$,

$$1 = \mathbb{E} e^{\gamma_0(u)(U_1 + u - r_u(T_1))} \ge \mathbb{E} e^{\gamma_0(u)(U_1 + v - r_v(T_1))}.$$

By convexity of the m.g.f. of $U_1 + v - r_v(T_1)$, this is only possible if $\gamma_0(v) \ge \gamma_0(u)$.
For (b), note that the assumption implies that $r_u(t) - u \ge t\,p(u)$. Hence

$$1 = \mathbb{E} e^{\gamma_0(u)(U_1 + u - r_u(T_1))} \le \mathbb{E} e^{\gamma_0(u)(U_1 - p(u)T_1)}
= \hat{B}[\gamma_0(u)]\, \frac{\beta}{\beta + \gamma_0(u)p(u)},$$

$$0 \le \beta(\hat{B}[\gamma_0(u)] - 1) - \gamma_0(u)p(u).$$

Since (3.4) considered as function of $\gamma$ is convex and equals 0 for $\gamma = 0$, this is
only possible if $\gamma_0(u) \ge \gamma(u)$.


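We prove Theorem 3.2 in terms of $\gamma_0$; the case of $\gamma$ then follows immediately
by Proposition 3.10(b):


Theorem 3.11 Assume that $p(x)$ is a non-decreasing function of $x$. Then

$$\psi(u) \le e^{-\int_0^u \gamma_0(x)\,dx}. \tag{3.16}$$


Proof. Define $\psi^{(n)}(u) = \mathbb{P}(\tau(u) \le \sigma_n)$ as the ruin probability after at most $n$
claims ($\sigma_n = T_1 + \cdots + T_n$). We shall show by induction that

$$\psi^{(n)}(u) \le e^{-\int_0^u \gamma_0(x)\,dx} \tag{3.17}$$

from which the theorem follows by letting $n \to \infty$. The case $n = 0$ is clear since
here $\sigma_0 = 0$ so that $\psi^{(0)}(u) = 0$. Assume (3.17) shown for $n$ and let $F_u(x) =
\mathbb{P}(U_1 + u - r_u(T_1) \le x)$. Separating after whether ruin occurs at the first claim
or not, we obtain

$$\begin{aligned}
\psi^{(n+1)}(u) &= 1 - F_u(u) + \int_{-\infty}^u \psi^{(n)}(u - x)\,F_u(dx) \\
&\le \int_u^\infty F_u(dx) + \int_{-\infty}^u e^{-\int_0^{u-x} \gamma_0(y)\,dy}\,F_u(dx) \\
&= e^{-\int_0^u \gamma_0(x)\,dx} \Big\{ \int_u^\infty e^{\int_0^u \gamma_0(y)\,dy}\,F_u(dx) + \int_{-\infty}^u e^{\int_{u-x}^u \gamma_0(y)\,dy}\,F_u(dx) \Big\}.
\end{aligned}$$

Considering the cases $x \ge 0$ and $x < 0$ separately, it is easily seen that
$\int_{u-x}^u \gamma_0(y)\,dy \le x\gamma_0(u)$. Also, $\int_0^u \gamma_0(y)\,dy \le u\gamma_0(u) \le x\gamma_0(u)$ for $x \ge u$. Hence

$$\begin{aligned}
\psi^{(n+1)}(u) &\le e^{-\int_0^u \gamma_0(x)\,dx} \Big\{ \int_u^\infty e^{x\gamma_0(u)}\,F_u(dx) + \int_{-\infty}^u e^{x\gamma_0(u)}\,F_u(dx) \Big\} \\
&= e^{-\int_0^u \gamma_0(x)\,dx}\, \hat{F}_u[\gamma_0(u)] \\
&= e^{-\int_0^u \gamma_0(x)\,dx},
\end{aligned}$$

where the last identity immediately follows from (3.15); we used also Proposition
3.10(a) for some of the inequalities.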


It follows from Proposition 3.10(b) that the bound provided by Theorem
3.11 is sharper than the one given by Theorem 3.2. However, $\gamma_0(u)$ appears
more difficult to evaluate than $\gamma(u)$. Also, for either of Theorems 3.2, 3.11 to be
reasonably tight something like the slow Markov walk conditions in Theorem 3.3
is required, and here it is easily seen that $\gamma_0(u) \approx \gamma(u)$. For these reasons, we
have chosen to work with $\gamma(u)$ as the fundamental local adjustment coefficient.


3c Proof of Theorem 3.3 


The idea of the proof is to bound $\{R_t^{(\varepsilon)}\}$ above and below in a small interval
$[x - x/n, x + x/n]$ by two classical risk processes with a constant $p$ and appeal
to the classical results (3.2), (3.3). To this end, define

$$u_{k,n} = \frac{k}{n}\,u, \qquad \overline{p}_{k,n} = \sup_{u_{k-1,n} \le x \le u_{k+1,n}} p(x), \qquad \underline{p}_{k,n} = \inf_{u_{k-1,n} \le x \le u_{k+1,n}} p(x),$$

and, in accordance with the notation $\psi_\varepsilon(u)$, $\psi^*_{p^*}(u)$, let $\psi^*_{p^*;\varepsilon}(u)$ denote the
ruin probability for the classical model with $\beta$ replaced by $\beta/\varepsilon$ and $U_i$ by $\varepsilon U_i$.


Lemma 3.12 $\limsup_{\varepsilon \downarrow 0} -\varepsilon \log \psi_\varepsilon(u) \le I(u)$.


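Proof. For ruin to occur, $\{R_t^{(\varepsilon)}\}$ (starting from $u = u_{n,n}$) must first downcross
$u_{n-1,n}$. The probability of this is at least $\psi^{n}_{\overline{p}_{n,n};\varepsilon}(u/n)$, the probability that
ruin occurs in the Cramér-Lundberg model with $p^* = \overline{p}_{n,n}$ (starting from $u/n$)
without that $2u/n$ is upcrossed before ruin. Further, given downcrossing occurs,
the value of $\{R_t^{(\varepsilon)}\}$ at the time of downcrossing is $< u_{n-1,n}$ so that

$$\begin{aligned}
\psi_\varepsilon(u) &\ge \psi^{n}_{\overline{p}_{n,n};\varepsilon}(u/n)\, \psi_\varepsilon(u_{n-1,n}) \\
&\ge \psi^{n}_{\overline{p}_{n,n};\varepsilon}(u/n)\, \psi^{n}_{\overline{p}_{n-1,n};\varepsilon}(u/n)\, \psi_\varepsilon(u_{n-2,n}) \\
&\ge \cdots \ge \prod_{k=1}^n \psi^{n}_{\overline{p}_{k,n};\varepsilon}(u/n).
\end{aligned}$$

Now as $\varepsilon \downarrow 0$,

$$\psi^*_{p^*;\varepsilon}(u) = \psi^*_{p^*}(u/\varepsilon) \sim C^* e^{-\gamma^* u/\varepsilon},$$

where the first equality follows by an easy scaling argument and the approximation
by (3.3). Let $C_{k,n}$, $\overline{\gamma}_{k,n}$ be $C^*$, resp. $\gamma^*$ evaluated for $p^* = \overline{p}_{k,n}$; in
particular, since $\gamma^*$ is an increasing function of $p^*$, also

$$\overline{\gamma}_{k,n} = \sup_{u_{k-1,n} \le x \le u_{k+1,n}} \gamma(x).$$

Clearly,

$$\begin{aligned}
\psi^*_{p^*;\varepsilon}(u/n) - \psi^{n}_{p^*;\varepsilon}(u/n) &\le \psi^*_{p^*;\varepsilon}(2u/n), \\
\psi^{n}_{\overline{p}_{k,n};\varepsilon}(u/n) &\ge \psi^*_{\overline{p}_{k,n};\varepsilon}(u/n) - \psi^*_{\overline{p}_{k,n};\varepsilon}(2u/n) \\
&\sim C_{k,n}\, e^{-\overline{\gamma}_{k,n} u/\varepsilon n}\big(1 - e^{-\overline{\gamma}_{k,n} u/\varepsilon n}\big) \\
&= C_{k,n}\, e^{-\overline{\gamma}_{k,n} u/\varepsilon n}\,(1 + o(1)),
\end{aligned}$$

where $o(1)$ refers to the limit $\varepsilon \downarrow 0$ with $n$ and $u$ fixed. It follows that

$$-\varepsilon \log \psi_\varepsilon(u) \le -\varepsilon \sum_{k=1}^n \log \psi^{n}_{\overline{p}_{k,n};\varepsilon}(u/n) = \frac{u}{n} \sum_{k=1}^n \overline{\gamma}_{k,n} + o(1),$$

$$\limsup_{\varepsilon \downarrow 0} -\varepsilon \log \psi_\varepsilon(u) \le \frac{u}{n} \sum_{k=1}^n \overline{\gamma}_{k,n}.$$

Letting $n \to \infty$ and using a Riemann sum approximation completes the proof.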


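Theorem 3.3 now follows easily in case (a). Indeed, in obvious notation one
has $\gamma_\varepsilon(x) = \gamma(x)/\varepsilon$, so that Theorem 3.2 gives

$$\psi_\varepsilon(u) \le e^{-I(u)/\varepsilon} \quad \Longrightarrow \quad \liminf_{\varepsilon \downarrow 0} -\varepsilon \log \psi_\varepsilon(u) \ge I(u).$$

Combining with the upper bound of Lemma 3.12 completes the proof.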


In case (b), we need the following condition: 


Condition 3.13 There exists a r.v. $V < \infty$ such that (i) for any $u < \infty$ there
exist $C_u < \infty$ and $\delta(u) > \sup_{x \le u} \gamma(x)$ such that

$$\mathbb{P}(V > x) \le C_u e^{-\delta(u)x}; \tag{3.18}$$

(ii) the family of claim overshoot distributions is stochastically dominated by $V$,
i.e. for all $x, y \ge 0$ it holds that

$$\mathbb{P}(U > x + y \mid U > x) = \frac{\overline{B}(x+y)}{\overline{B}(x)} \le \mathbb{P}(V > y). \tag{3.19}$$


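To complete the proof, let $v < u$ and define

$$\tau^{(\varepsilon)}(u,v) = \inf\{t > 0 : R_t^{(\varepsilon)} < v \mid R_0^{(\varepsilon)} = u\}, \qquad \xi^{(\varepsilon)}(u,v) = v - R^{(\varepsilon)}_{\tau^{(\varepsilon)}(u,v)}.$$

Then

$$\begin{aligned}
\psi_\varepsilon(u) &= \mathbb{E}\big[\psi_\varepsilon\big(R^{(\varepsilon)}_{\tau^{(\varepsilon)}(u,u/n)}\big);\ \tau^{(\varepsilon)}(u,u/n) < \infty\big] \\
&= \mathbb{E}\big[\psi_\varepsilon\big(u/n - \xi^{(\varepsilon)}(u,u/n)\big);\ \tau^{(\varepsilon)}(u,u/n) < \infty\big] \\
&= \mathbb{E}\big[\psi_\varepsilon\big(u/n - \xi^{(\varepsilon)}(u,u/n)\big) \,\big|\, \tau^{(\varepsilon)}(u,u/n) < \infty\big]\, \mathbb{P}\big(\tau^{(\varepsilon)}(u,u/n) < \infty\big) \\
&\le \mathbb{E}\psi_\varepsilon(u/n - \varepsilon V) \cdot \mathbb{P}\big(\tau^{(\varepsilon)}(u,u/n) < \infty\big).
\end{aligned}$$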


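Write $\mathbb{E}\psi_\varepsilon(u/n - \varepsilon V) = E_1 + E_2$, where $E_1$ is the contribution from the event
that the process does not reach level $2u/n$ before ruin and $E_2$ is the rest. Then,
with $\underline{\gamma}_{k,n}$ denoting $\gamma^*$ evaluated for $p^* = \underline{p}_{k,n}$,
the standard Lundberg inequality yields

$$\begin{aligned}
E_1 &\le \mathbb{E}\psi^*_{\underline{p}_{1,n};\varepsilon}(u/n - \varepsilon V) = \mathbb{E}\psi^*_{\underline{p}_{1,n}}(u/\varepsilon n - V) \\
&\le e^{-\underline{\gamma}_{1,n} u/\varepsilon n}\, \mathbb{E}\big[e^{\underline{\gamma}_{1,n} V};\ V \le u/\varepsilon n\big] + \mathbb{P}(V > u/\varepsilon n) \\
&= e^{-\underline{\gamma}_{1,n} u/\varepsilon n}\, O(1)
\end{aligned}$$

(using (3.18) for the last equality). For $E_2$, we first note that the number of
downcrossings of $2u/n$ starting from $R_0^{(\varepsilon)} = 2u/n$ is bounded by a geometric r.v.
$N$ with

$$\mathbb{E}N \le \frac{1}{1 - \psi^*_{\inf_{x \ge 2u/n} p(x);\varepsilon}(0)} = \frac{\inf_{x \ge 2u/n} p(x)}{\inf_{x \ge 2u/n} p(x) - \beta\mathbb{E}U} = O(1),$$

cf. (3.5) and the standard formula for $\psi(0)$. The probability of ruin in between
two downcrossings is bounded by

$$\mathbb{E}\psi^*_{\underline{p}_{1,n};\varepsilon}(2u/n - \varepsilon V) = e^{-2\underline{\gamma}_{1,n} u/\varepsilon n}\, O(1)$$

so that

$$E_2 \le e^{-2\underline{\gamma}_{1,n} u/\varepsilon n}\, O(1), \qquad E_1 + E_2 \le e^{-\underline{\gamma}_{1,n} u/\varepsilon n}\, O(1).$$

Hence

$$\begin{aligned}
\liminf_{\varepsilon \downarrow 0} -\varepsilon \log \psi_\varepsilon(u)
&\ge \liminf_{\varepsilon \downarrow 0} -\varepsilon\big\{\log(E_1 + E_2) + \log \mathbb{P}\big(\tau^{(\varepsilon)}(u,u/n) < \infty\big)\big\} \\
&\ge \frac{u}{n}\,\underline{\gamma}_{1,n} + \liminf_{\varepsilon \downarrow 0} -\varepsilon \log \mathbb{P}\big(\tau^{(\varepsilon)}(u,u/n) < \infty\big) \\
&\ge \frac{u}{n} \sum_{k=1}^n \underline{\gamma}_{k,n}.
\end{aligned}$$

Another Riemann sum approximation completes the proof.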


Notes and references With the exception of Theorem 3.1, the results are from 
Asmussen & Nielsen [92]; they also discuss simulation based upon ‘local exponential 
change of measure’ for which the likelihood ratio is 


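$$L_t = \exp\Big\{-\int_0^t \gamma(R_{s-})\,dR_s\Big\} = \exp\Big\{-\int_0^t \gamma(R_s)p(R_s)\,ds + \sum_{i=1}^{N_t} \gamma(R_{\sigma_i -})\,U_i\Big\}.$$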


An approximation similar to (3.7) for ruin probabilities in the presence of an upper
barrier $b$ appears in Cottrell et al. [263], where the key mathematical tool is the deep
Wentzell-Freidlin theory of slow Markov walks (see e.g. Bucklew [207]). Djehiche [324]
gives an approximation for $\psi(u, T) = \mathbb{P}_u(\inf_{0 \le t \le T} R_t < 0)$ via related large deviations
techniques. Comparing these references with the present work shows that in the slow
Markov walk set-up, the risk process itself is close to the solution of the differential
equation

$$\dot{r}(x) = -\kappa'(x, 0) \quad (= p(x) - \beta\mathbb{E}U) \tag{3.20}$$

(with $\kappa(x,s)$ as in (3.4) and the prime meaning differentiation w.r.t. $s$), whereas the
most probable path leading to ruin is the solution of

$$\dot{r}(x) = -\kappa'(x, \gamma(x)) \tag{3.21}$$

(the initial condition is $r(0) = u$ in both cases). Whereas the result of [324] is given
in terms of an action integral which does not look very explicit, one can in fact arrive
at the optimal path by showing that the approximation for $\psi(u, T)$ is maximized over
$T$ by taking $T$ as the time for (3.21) to pass from $u$ to 0; the approximation (3.7)
then comes out (at least heuristically) by analytical manipulations with the action
integral. Similarly, it might be possible to show that the limits $\varepsilon \downarrow 0$ and $b \uparrow \infty$
are interchangeable in the setting of [263]. Typically, the rigorous implementation of
these ideas via large deviations techniques would require slightly stronger smoothness
conditions on $p(x)$ than ours and conditions somewhat different from Condition 3.13,
the simplest being to require $\hat{B}[s]$ to be defined for all $s > 0$ (thus excluding, e.g., the
exponential distribution). We would like, however, to point out as a maybe much more
important fact that the present approach is far more elementary and self-contained
than that using large deviations theory. For different types of applications of large
deviations to ruin probabilities, see XIII.3.

Asymptotic results for surplus-dependent premium under heavy-tailed claims are 
given in Section X.5. 


4 The model with tax 
Consider now a compound Poisson risk process $R^{(\gamma)}$ with constant premium
income intensity $(1 - \gamma)p$ $(0 \le \gamma < 1)$ whenever the risk process is in its
running maximum $M_t^{(\gamma)} = \max\{R_s^{(\gamma)},\ 0 \le s \le t\}$ and constant premium income
intensity $p$ otherwise. So the dynamics of the risk process are given by

$$dR_t^{(\gamma)} = \begin{cases}
p\,dt - dA_t, & \text{if } R_t^{(\gamma)} < M_t^{(\gamma)}, \\
(1 - \gamma)p\,dt - dA_t, & \text{if } R_t^{(\gamma)} = M_t^{(\gamma)},
\end{cases} \tag{4.1}$$

where $A_t = \sum_{i=1}^{N_t} U_i$ are the aggregate claims up to time $t$. As outlined in
Example 1.4, a natural interpretation for this model is that the insurance company
needs to pay tax at rate $\gamma$ whenever the risk process is at a new record height
(which is considered to be profit) and does not need to pay tax if it is below
the running maximum, as then the incoming premium is needed to amortize
the previous claim payments until a new running maximum is reached. But one
can also simply think of a profit-participation in terms of dividend payments
to shareholders according to the above scheme. Figure VIII.2 depicts a sample
path of the resulting risk process.


FIGURE VIII.2 (sample path of the risk process with tax; vertical axis $R_t$)


The resulting ruin probability $\psi_\gamma(u)$ has a strikingly simple relation to the
ruin probability $\psi(u) = \psi_0(u)$ of the original risk process.
Theorem 4.1 In case of positive safety loading $\eta > 0$ and $\gamma < 1$, $\psi_\gamma(u) < 1$
holds for all $u \ge 0$. In particular, in this case

$$1 - \psi_\gamma(u) = (1 - \psi_0(u))^{1/(1-\gamma)}. \tag{4.2}$$


Proof. Recall from Theorem II.2.3 that the survival probability $\phi(u) = 1 - \psi(u)$
in the classical compound Poisson risk model can be interpreted as the probability
to have zero events during $[u, \infty)$ of an inhomogeneous Poisson process
with rate $\beta(s) = \beta/p \cdot \mathbb{P}(V_{\max} > s)$, where $V_{\max}$ denotes the maximum workload
of an M/G/1 queue.⁴ But now one realizes that the survival probability
with tax $\phi_\gamma(u)$ can also be interpreted in a similar way, since when we cut
out the excursions away from the running maximum that do not lead to ruin
(which are identical to those without tax), then a straight line with slope
$p(1 - \gamma)$ remains. After rescaling to slope 1, the probability to survive is to
have no events during $[u, \infty)$ of the inhomogeneous Poisson process with rate
$\beta_\gamma(s) = \beta/(p(1 - \gamma)) \cdot \mathbb{P}(V_{\max} > s)$, where $V_{\max}$ is again the maximum workload
of the original M/G/1 queue. In view of (2.5) from Chapter II, this leads to

$$\phi_\gamma(u) = \exp\Big(-\frac{\beta}{p(1-\gamma)} \int_u^\infty \mathbb{P}(V_{\max} > s)\,ds\Big) = (\phi_0(u))^{1/(1-\gamma)}.$$


Remark 4.2 If one defines $\phi_\gamma(u,v)$ for $u < v$ as the probability that, starting
from level $u$ at time 0, the process reaches level $v$ before ruin occurs (clearly
$\phi_\gamma(u) = \phi_\gamma(u,\infty)$), then, along the same line of arguments as above, we
have

$$\phi_\gamma(u,v) = \exp\Big(-\frac{\beta}{p(1-\gamma)} \int_u^v \mathbb{P}(V_{\max} > s)\,ds\Big) = (\phi_0(u,v))^{1/(1-\gamma)}. \tag{4.3}$$


It is now straightforward to extend identity (4.2) to a surplus-dependent tax 
rate. 


Corollary 4.3 If the tax rate $\gamma(r)$ $(0 \le \gamma(r) < 1)$ depends on the current
surplus level $R_t = r$, then the corresponding survival probability $\phi_\gamma(u)$ is given
by

$$\phi_\gamma(u) = \exp\Big(-\int_u^\infty \Big(\frac{d}{ds} \log \phi_0(s)\Big) \frac{1}{1 - \gamma(s)}\,ds\Big). \tag{4.4}$$


⁴In Theorem II.2.3 we had $p = 1$. Here we have a general $p$, so we first need to rescale time
by the factor $p$, i.e. the Poisson process with intensity $\beta/p$ and the new premium rate 1 leads
to the same $\phi(u)$ as the original process, but now can be linked with the M/G/1 queue.

Proof. In this case $\phi_\gamma(u)$ is the probability to have zero events during $[u, \infty)$ of an
inhomogeneous Poisson process with rate $\beta(s) = \beta/(p(1 - \gamma(s))) \cdot \mathbb{P}(V_{\max} > s)$,
where $V_{\max}$ is again the maximum workload of a busy period in a classical
M/G/1 queue. Now just again invoke Theorem II.2.3.


Remark 4.4 From (4.4) one sees that even in the case $\gamma(r) \to 1$ as $r \to \infty$
there can be a positive probability of survival, as long as the convergence rate
is sufficiently low.
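The identities (4.2) and (4.4) can be checked numerically in a case where $\phi_0$ is explicit. The sketch below assumes exponential claims with rate `delta`, for which the classical formula $\psi_0(u) = \frac{\beta}{\delta p} e^{-(\delta - \beta/p)u}$ holds, and truncates the integral in (4.4) at a finite `cutoff`; the function names, the tax-rate functions and the parameter values are illustrative assumptions.

```python
import math

def phi0(u, beta, delta, p):
    """Survival probability without tax for exponential(delta) claims (classical formula)."""
    return 1.0 - beta / (delta * p) * math.exp(-(delta - beta / p) * u)

def dlog_phi0(s, beta, delta, p):
    """d/ds log phi0(s), in closed form for the exponential case."""
    c, R = beta / (delta * p), delta - beta / p
    return c * R * math.exp(-R * s) / (1.0 - c * math.exp(-R * s))

def survival_with_tax(u, tax_rate, beta, delta, p, cutoff=80.0, n=8000):
    """Formula (4.4); the integral over [u, inf) is truncated at u + cutoff (Simpson rule)."""
    f = lambda s: dlog_phi0(s, beta, delta, p) / (1.0 - tax_rate(s))
    h = cutoff / n
    total = f(u) + f(u + cutoff)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(u + k * h)
    return math.exp(-total * h / 3.0)

beta, delta, p = 0.5, 1.0, 1.0
# Constant tax rate: should agree with the tax identity (4.2)
print(survival_with_tax(1.0, lambda s: 0.3, beta, delta, p), phi0(1.0, beta, delta, p) ** (1 / 0.7))
# Tax rate s/(1+s) tending to 1: survival probability stays positive, cf. Remark 4.4
print(survival_with_tax(1.0, lambda s: s / (1.0 + s), beta, delta, p))
```

In the second call the factor $1/(1-\gamma(s)) = 1 + s$ grows only linearly while $\frac{d}{ds}\log\phi_0(s)$ decays exponentially, which is why the integral in (4.4) remains finite.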


Consider finally a further generalization of the risk process given by the
surplus-dependent dynamics

$$dR_t^{\gamma} = \begin{cases}
p_1(R_t^{\gamma})\,dt - dA_t, & \text{if } R_t^{\gamma} < M_t^{\gamma}, \\
p_0(R_t^{\gamma})\,dt - dA_t, & \text{if } R_t^{\gamma} = M_t^{\gamma}.
\end{cases} \tag{4.5}$$

Denote with $q(u)$ the probability that if a claim occurs in the running maximum
$u$, then ruin occurs before $R_t^{\gamma}$ reaches level $u$ again (note that $q(u)$ depends on
$p_1(\cdot)$, but not on $p_0(\cdot)$). Then, the same reasoning as above (where now $q(u)$
takes the role of $\mathbb{P}(V_{\max} > u)$) shows that the survival probability is given by

$$\phi(u) = \exp\Big(-\beta \int_u^\infty \frac{q(s)}{p_0(s)}\,ds\Big). \tag{4.6}$$

It immediately follows that if $p_0(s) = (1 - \gamma)p_1(s)$, then the tax identity (4.2)
holds again, where $\psi_0(u)$ then refers to the ruin probability of the risk process
with premium rule $p_1(x)$ for all $x \ge 0$. In particular, this shows that the tax
identity also holds for the compound Poisson risk process with interest (where
$p_1(x) = p + ix$) discussed in Section 2. As a by-product, one obtains with
$p_1(x) = p_0(x) = p + ix$ that the ruin probability of the classical risk process
with constant interest (without tax) can be expressed as

$$\psi(u) = 1 - \exp\Big(-\beta \int_u^\infty \frac{q(s)}{p + is}\,ds\Big),$$

which can be compared with (1.6). It then remains, however, to identify an
explicit expression for the quantity $q(u)$.


Notes and references The tax model was introduced in Albrecher & Hipp [28], 
where the identity (4.2) was derived by a different approach. The simpler proof given 
here is based on Albrecher, Borst, Boxma & Resing [16], where also the treatment of 
the surplus-dependent tax rate can be found. Albrecher, Renaud & Zhou [35] used 
excursion theory to extend the identity (4.2) to arbitrary spectrally negative Lévy 
processes and also generalized a formula from [28] for the moments of the accumulated 
discounted tax payments until ruin in terms of scale functions. Kyprianou & Zhou [569] 
then further extended this approach to surplus-dependent tax rates and quantities like
the time of ruin, the deficit at ruin and the surplus prior to ruin in the tax model; see
also Renaud [736]. A more direct analysis of these latter quantities in the compound
Poisson setting with tax can be found in Ming & Wang [645] and absolute ruin under
the tax payment scheme is studied in Ming, Wang & Xao [646].

A treatment of the tax problem for the model (4.5) can be found in Wei [878], 
where (4.6) is derived by a direct differential argument and in Wang et al. [868], where 
the accumulated discounted tax payments until ruin is considered. For an extension to 
a Markov-modulated model, see Wei, Yang & Wang [877]. A slightly different model is 
considered in Hao & Tang [448], where the authors give a fine asymptotic study of the 
ruin probabilities of a spectrally negative Lévy risk model that is subject to periodic 
taxation on its net gains during each period. 

The effect of tax payments on a non-Markovian risk process is investigated in 
Albrecher, Badescu & Landriault [15] in the context of the dual risk model, in which 
the sign of the premium income and the aggregate claims are reverted. The simple 
tax identity (4.2) then does not hold anymore, but a similar relationship holds for 
arbitrary interarrival times and exponential jump sizes. 


5 Discrete-time ruin problems with stochastic 
investment 


We consider a discrete-time risk reserve process $R_0, R_1, \ldots$ given by $R_0 = u > 0$
and the recursion

$$R_n = A_n R_{n-1} - B_n, \tag{5.1}$$

where $\{A_n\}$, $\{B_n\}$ are independent sequences each consisting of i.i.d. r.v.'s and
$A_n > 0$. The interpretation is that the reserve is invested in risky assets yielding
a stochastic interest rate of $A_n - 1$ in period $n$, whereas $B_n$ is the claim surplus,
that is, the difference between claims and premiums received. The reserve may
decrease if either $A_n < 1$ or $B_n > 0$, so that financial risk enters via the
$A_n$ and traditional insurance risk via the $B_n$. As usual, the ruin time $\tau(u)$
corresponding to $R_0 = u$ is the first $n$ with $R_n < 0$, and the ruin probability
$\psi(u)$ is the probability that $R_n < 0$ for some $n$. To avoid trivialities, we assume
$\mathbb{P}(B_n > 0) > 0$ since otherwise $\psi(u) = 0$ for all $u \ge 0$, and also $\mathbb{P}(A_n < 1) > 0$
since otherwise there is no investment risk.

A first question is when $\psi(u) = 1$ for all $u$ and when not. One could expect
the first possibility to occur when $\mathbb{E}A_n < 1$ but not when $\mathbb{E}A_n > 1$ (then
one would expect $R_n \to \infty$ with positive probability). However, the relevant
criterion is in terms of $\mathbb{E}\log A_n$:


Proposition 5.1 Assume $\mathbb{E}\log|B_n| < \infty$. If $\mathbb{E}\log A_n < 0$, then $\psi(u) = 1$ for
all $u \ge 0$. If $\mathbb{E}\log A_n > 0$, then $\psi(u) < 1$ for all large $u > 0$.
Proof for $\mathbb{E}\log A_n < 0$.
First note that for $R_{n-1} = x > 0$, we have

$$\log R_n^+ = \log A_n + \log x + \log(1 - B_n/x)^+ \le \log A_n + \log x + \log(1 + |B_n|/x).$$

By assumption, there exists $x_0$ such that $\mathbb{E}\log A_n + \mathbb{E}\log(1 + |B_n|/x) < 0$ for
all $x \ge x_0$. I.e.,

$$\mathbb{E}\big[\log R_n^+ - \log x \mid R_{n-1} = x\big] < 0$$

for all $x \ge x_0$. By standard recurrence criteria for Markov processes ([APQ, p. 21]),
this implies that $R_n$ cannot go to $\infty$ so that some interval of the form $[0, x_1]$
is visited i.o. by $\{R_n\}$. Since (unless for trivial cases) $\inf_{0 \le u \le x_1} \psi(u) > 0$, a
geometric trial argument therefore gives $\psi(u) = 1$.

The proof for the case $\mathbb{E}\log A_n > 0$ will be given after Proposition 5.3.
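That the criterion involves $\mathbb{E}\log A_n$ rather than $\mathbb{E}A_n$ can be seen from a small example. The sketch below assumes a hypothetical two-point return distribution (values 0.1 and 2.5 with equal probability) chosen only so that $\mathbb{E}A_n > 1$ while $\mathbb{E}\log A_n < 0$; it also simulates the average log-growth of a pure investment $A_1 \cdots A_n$.

```python
import math
import random

def mean_and_log_mean(values, probs):
    """Return (E A, E log A) for a discrete distribution of the return factor A."""
    mean = sum(q * a for a, q in zip(values, probs))
    log_mean = sum(q * math.log(a) for a, q in zip(values, probs))
    return mean, log_mean

def average_log_growth(values, probs, n, seed=1):
    """(1/n) * log(A_1 * ... * A_n) for i.i.d. draws; close to E log A by the law of large numbers."""
    rng = random.Random(seed)
    draws = rng.choices(values, weights=probs, k=n)
    return sum(math.log(a) for a in draws) / n

values, probs = [0.1, 2.5], [0.5, 0.5]   # hypothetical return distribution
print(mean_and_log_mean(values, probs))      # E A = 1.3 > 1, but E log A < 0
print(average_log_growth(values, probs, 20000))
```

Here the invested capital grows on average per period, yet almost every path of $A_1 \cdots A_n$ decays geometrically, which is the situation covered by the first case of Proposition 5.1.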


In view of Proposition 5.1, we henceforth assume $\mathbb{E}\log|B_n| < \infty$ and
$\mathbb{E}\log A_n > 0$. This implies in particular $\mathbb{P}(A_n > 1) > 0$ (recall that we also
assumed $\mathbb{P}(A_n < 1) > 0$).

We next note some representations of $R_n$ and the ruin time, which are similar
to some from Section 2 in the constant interest rate case (with the $-Y$ chain
taking the role of the $Z$ process there).


Proposition 5.2 Define $D_n = A_1^{-1} \cdots A_n^{-1}$, $R_n^* = D_n R_n$ and

$$Y_n = D_1 B_1 + D_2 B_2 + \cdots + D_n B_n = \sum_{k=1}^n D_k B_k.$$

Then

$$R_n^* = u - Y_n, \qquad \tau(u) = \inf\{n \ge 1 : Y_n > u\}. \tag{5.2}$$

Note that $A_k^{-1}$ is simply the discounting factor for period $k$ and $D_n$ the one
for the totality of periods $1, \ldots, n$. Thus $R_n^*$ is simply the present value of the
reserve, and the formula $R_n^* = u - Y_n$ tells that, as should be, this is the initial
reserve minus the present value of claim surpluses from the different periods.


Proof. We can rewrite (5.1) (with $n$ replaced by $k$) as $R_k^* - R_{k-1}^* = -D_k B_k$.
Thus

$$R_n^* = R_0^* + \sum_{k=1}^n (R_k^* - R_{k-1}^*) = u - \sum_{k=1}^n D_k B_k = u - Y_n.$$

From this the claim on $\tau(u)$ follows by noting that $R_n < 0$ if and only if $R_n^* < 0$,
because $A_n > 0$.
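The representation (5.2) can be verified path by path. The following sketch simulates the recursion (5.1) with hypothetical distributions (lognormal return factors, and claim surplus equal to an exponential claim minus a fixed premium of 1.2); these choices are assumptions for illustration only, since the identity holds for any positive $A_n$.

```python
import random

def simulate(u, n, seed=0):
    """Simulate (5.1) and track the discount factor D_n and discounted claim surplus Y_n."""
    rng = random.Random(seed)
    R, D, Y = u, 1.0, 0.0
    path = []
    for _ in range(n):
        A = rng.lognormvariate(0.05, 0.2)   # hypothetical return factor, A > 0
        B = rng.expovariate(1.0) - 1.2      # hypothetical claim surplus: claim minus premium
        R = A * R - B                       # recursion (5.1)
        D /= A                              # D_n = A_1^{-1} ... A_n^{-1}
        Y += D * B                          # Y_n = sum_k D_k B_k
        path.append((R, D, Y))
    return path

u = 5.0
for R, D, Y in simulate(u, 5):
    print(D * R, u - Y)   # the two columns agree up to rounding, cf. (5.2)
```

Since $D_n > 0$, the sign of $R_n$ equals the sign of $u - Y_n$, which is the statement about $\tau(u)$ in (5.2).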


Proposition 5.3 Assume $A_1^{-1} > 0$, $\mu^* = \mathbb{E}\log A_1^{-1} < 0$ and $\mathbb{E}\log|B_1| < \infty$.
Then the r.v. $Y = \sum_{n=1}^\infty D_n B_n$ satisfies $\mathbb{P}(-\infty < Y < \infty) = 1$, and $Y_n \xrightarrow{\text{a.s.}} Y$ as
$n \to \infty$.
Proof. We have a.s. that $D_n = e^{\mu^* n(1 + o(1))}$. Furthermore, a standard application
of the Borel-Cantelli lemma shows that $\mathbb{E}\log|B_n| < \infty$ ensures that for all $c_1 > 0$ one
has $\log|B_n| > n c_1$, i.e. $|B_n| > e^{c_1 n}$, only for finitely many $n$. Choosing $c_1 < -\mu^*$,
it follows that the terms in the series in (5.3) decay at least geometrically fast,
which implies the assertions.


As a corollary, we can give the
Proof of Proposition 5.1 when $\mathbb{E}\log A_n > 0$.
Note that

$$\psi(u) = \mathbb{P}(Y_n > u \text{ for some } n) \le \mathbb{P}\Big(\sum_{i=1}^\infty D_i |B_i| > u\Big).$$

By the above, the sum is a finite r.v., and therefore the r.h.s. goes to 0 as
$u \to \infty$.


The ruin probability can be represented in terms of Y as follows: 


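Theorem 5.4 Let $H(u) = \mathbb{P}(Y > u)$. Then $\psi(u) = H(u)/C_1(u)$ where
$C_1(u) = \mathbb{E}[H(R_{\tau(u)}) \mid \tau(u) < \infty]$.


Proof. For brevity, write $\tau = \tau(u)$. On $\{\tau < \infty\}$ we have $u - Y = u - Y_\tau - (Y - Y_\tau)$.
Here

$$Y - Y_\tau = \sum_{i=\tau+1}^\infty D_i B_i = D_\tau \sum_{i=\tau+1}^\infty \frac{D_i}{D_\tau} B_i = D_\tau \tilde{Y}$$

where

$$\tilde{Y} = \sum_{i=1}^\infty A_{\tau+1}^{-1} \cdots A_{\tau+i}^{-1} B_{\tau+i}$$

is a copy of $Y$ independent of $\tau$ and $R_\tau$. Thus

$$u - Y = D_\tau\big(D_\tau^{-1}(u - Y_\tau) - \tilde{Y}\big) = D_\tau(R_\tau - \tilde{Y}).$$

Since $Y_n \to Y$, we also have $Y_n > u$ for some $n$ (and hence $\tau < \infty$) when $Y > u$.
Thus

$$\begin{aligned}
H(u) = \mathbb{P}(u - Y < 0) &= \mathbb{P}(R_\tau - \tilde{Y} < 0,\ \tau < \infty) \\
&= \psi(u)\,\mathbb{P}(R_\tau - \tilde{Y} < 0 \mid \tau < \infty) = \psi(u)\,\mathbb{E}[H(R_\tau) \mid \tau < \infty].
\end{aligned}$$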


Proposition 5.5 The r.v. $Y$ satisfies $Y \stackrel{\mathcal{D}}{=} A_1^{-1}(B_1 + \tilde{Y})$ where $\tilde{Y}$ is a copy of
$Y$ which is independent of $A_1$, $B_1$.
Proof. Take

$$\tilde{Y} = \sum_{i=2}^\infty A_2^{-1} \cdots A_i^{-1} B_i.$$


The representation in Theorem 5.4 is in general quite intractable, because
usually $H$ and $C_1(u)$ cannot be calculated. An obvious and important question
is therefore to ask for tail asymptotics. The question that immediately comes
up is how strongly $C_1(u)$ depends on $u$. The following result shows that under
suitable conditions, this dependence is weak so that the important part of the
tail asymptotics is $H(u)$ itself.

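Proposition 5.6 $\displaystyle 0 < \liminf_{u \to \infty} \frac{\psi(u)}{H(u)} \le \limsup_{u \to \infty} \frac{\psi(u)}{H(u)} < \infty.$


Proof. Since $-\infty < R_\tau < 0$, we have $H(0) \le C_1(u) \le 1$, so all that remains to
show is $H(0) > 0$. Assume otherwise. Then the upper point $a$ in the support
of $Y$ satisfies $-\infty < a \le 0$. Assume first $a < 0$. Recall that by assumption,
$p_1 = \mathbb{P}(A_1 > 1) > 0$ and/or that $B_1 \le 0$ is excluded. Choose $\varepsilon > 0$ with
$p_2 = \mathbb{P}(B_1 > 2\varepsilon) > 0$ and let $p_3 = \mathbb{P}(\tilde{Y} > a - \varepsilon)$. Then $p_3 > 0$ and hence

$$\begin{aligned}
\mathbb{P}(Y > a + \varepsilon) &= \mathbb{P}\big(A_1^{-1}(B_1 + \tilde{Y}) > a + \varepsilon\big) \\
&\ge p_1\,\mathbb{P}(B_1 + \tilde{Y} > a + \varepsilon) \ge p_1 p_2 p_3 > 0.
\end{aligned}$$

If instead $a = 0$, we have similarly that $\mathbb{P}(Y > a + \varepsilon) \ge p_2 p_3 p_4$ where
$p_4 = \mathbb{P}(A_1 < 1) > 0$. In both cases, we have reached a contradiction with
the definition of $a$.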


It thus remains to get some hold on $H(u)$. We shall here involve a classical
result due to Kesten [531] and Goldie [421] on perpetuities. By a perpetuity one
understands a r.v. of the form

$$Y = B_1 + D_1 B_2 + D_2 B_3 + \cdots = \sum_{i=1}^\infty D_{i-1} B_i \tag{5.3}$$

where $\{A_n\}$, $\{B_n\}$ are independent sequences each consisting of i.i.d. r.v.'s and
$D_n = A_1^{-1} \cdots A_n^{-1}$ (the use of reciprocals is unusual but made to conform with
the above risk theoretic setting). The result of [421, 531] states that under
suitable conditions, $H(u) = \mathbb{P}(Y > u)$ decays like an $\alpha$-power of $u$. Before
stating the result (the proof of which is outside the scope of this book), we give
as a help for intuition some heuristic steps that motivate
the heavy tail and allow one to identify $\alpha$.


246 CHAPTER VII. LEVEL-DEPENDENT RISK PROCESSES 


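Remark 5.7 Let $T_n$ be the random walk $-\log A_1 - \cdots - \log A_n$ so that
$Y = \sum_{n=1}^\infty e^{T_n} B_n$. With light-tailed $B_n$, we expect that a single term $e^{T_n} B_n$ can only
be large if $T_n$ is so. By assumption, $T_n$ has negative drift and we will assume
that the conditions of the Cramér-Lundberg approximation are satisfied. Then
the probability that $T_n > x$ for some $n$ is approximately $C' e^{-\alpha x}$ where $\alpha$ solves

$$1 = \mathbb{E}e^{-\alpha \log A} = \mathbb{E}A^{-\alpha}. \tag{5.4}$$

Choose $b_0 > 0$ with $\overline{B}(b_0) = \mathbb{P}(B_n > b_0) > 0$. For an $n$ with $T_n > x$ we then
have $\mathbb{P}(e^{T_n} B_n > e^{x} b_0) \ge \overline{B}(b_0)$. Taking $x = \log u - \log b_0$, it follows that

$$\mathbb{P}(e^{T_n} B_n > u \text{ for some } n) \ge C' e^{-\alpha x}\, \overline{B}(b_0) = \frac{C'\, \overline{B}(b_0)\, b_0^{\alpha}}{u^{\alpha}}.$$

This motivates that $Y$ is heavy-tailed and that the tail decays at least as a
power.

The first trial solution for tail asymptotics is therefore a power tail,
$\mathbb{P}(Y > u) \sim C/(1+u)^{\tilde\alpha}$. Assume as before that $B_n$ is light-tailed. Then Proposition 5.5
implies that $Y$ and $A^{-1}\tilde{Y}$ have equivalent tails, i.e.

$$\frac{C}{(1+u)^{\tilde\alpha}} \sim \int_0^\infty \mathbb{P}(\tilde{Y} > u/a)\, \mathbb{P}(A^{-1} \in da)
\sim \int_0^\infty \frac{C a^{\tilde\alpha}}{(a+u)^{\tilde\alpha}}\, \mathbb{P}(A^{-1} \in da).$$

Multiplying by $(1+u)^{\tilde\alpha}$ and going to the limit under the integral, the r.h.s.
becomes $C\,\mathbb{E}A^{-\tilde\alpha}$. This suggests that $\tilde\alpha = \alpha$, where $\alpha$ is as in (5.4).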


Here is the result of Kesten [531] and Goldie [421]: 


Theorem 5.8 Assume that there exists $\alpha > 0$ such that $\mathbb{E}A_1^{-\alpha} = 1$ together
with $\mathbb{E}A_1^{-\alpha} \log^{-} A_1 < \infty$ and $\mathbb{E}|B_1|^{\alpha} < \infty$. Assume further that the distribution
of $A_1^{-1}$ is non-lattice. Then for some $C_2 > 0$,

$$\mathbb{P}(Y > u) \sim \frac{C_2}{u^{\alpha}}, \quad u \to \infty. \tag{5.5}$$

The (intractable) expression for $C_2$ is given in [531, 421] (it is shown in Nyrhinen
[669] that $C_2 > 0$). The proof of Theorem 5.8 is much too technical to
be given here. However, up to the intractable constant we can now obtain the
desired asymptotics of the ruin probability by combining with Theorem 5.4:
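The exponent $\alpha$ is determined by the equation $\mathbb{E}A_1^{-\alpha} = 1$ alone and is easy to compute. The sketch below solves it by bisection for a discrete distribution of $A_1$; the function name and the two-point example are assumptions for illustration. (For lognormal $A_1$ with $\log A_1 \sim N(m, s^2)$, the equation $-\alpha m + \alpha^2 s^2/2 = 0$ gives $\alpha = 2m/s^2$ directly.)

```python
import math

def tail_exponent(values, probs, tol=1e-12):
    """Positive root alpha of E[A^(-alpha)] = 1 for a discrete distribution of A (bisection)."""
    f = lambda alpha: sum(q * a ** (-alpha) for a, q in zip(values, probs)) - 1.0
    if sum(q * math.log(a) for a, q in zip(values, probs)) <= 0.0:
        raise ValueError("need E log A > 0")
    if min(values) >= 1.0:
        raise ValueError("need P(A < 1) > 0, cf. Remark 5.11")
    hi = 1.0
    while f(hi) <= 0.0:      # f is convex with f(0) = 0 and f'(0) < 0, so f > 0 eventually
        hi *= 2.0
    lo = hi
    while f(lo) >= 0.0:      # move left until f < 0
        lo /= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical returns: A = 0.5 with probability 0.25, A = 2 with probability 0.75
print(tail_exponent([0.5, 2.0], [0.25, 0.75]))   # close to log2(3)
```

In this two-point example the equation reads $0.25 \cdot 2^{\alpha} + 0.75 \cdot 2^{-\alpha} = 1$, whose positive solution is $2^{\alpha} = 3$.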
Corollary 5.9 Under the conditions of Theorem 5.8, $\psi(u) \sim \dfrac{C_2}{C_1(u)\,u^{\alpha}}$. In
particular,

$$0 < \liminf_{u \to \infty} u^{\alpha}\psi(u) \le \limsup_{u \to \infty} u^{\alpha}\psi(u) < \infty.$$

The result follows immediately by combining Theorems 5.8, 5.4 and Proposition
5.6 with the following lemma:


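Lemma 5.10 Under the conditions of Theorem 5.8, $\mathbb{P}(Y > u) \sim \mathbb{P}(Y^* > u) \sim C_2/u^{\alpha}$,
with $Y^*$ as in the proof below.

Proof. We have $Y = A_1^{-1} Y^*$ where $Y^* = \sum_{n=1}^\infty B_n \prod_{k=2}^n A_k^{-1}$ is a copy of the
perpetuity (5.3) independent of $A_1$. Rewrite Theorem 5.8 as $\mathbb{P}(Y^* > y) \sim C_2/(1 + y)^{\alpha}$. Then
also $\mathbb{P}(Y^* > y) \le C_3/(1 + y)^{\alpha}$ for all $y \ge 0$ since clearly such an inequality holds
on any finite interval. Thus

$$\begin{aligned}
u^{\alpha}\,\mathbb{P}(Y > u) &= u^{\alpha} \int_0^\infty \mathbb{P}(Y^* > au)\, \mathbb{P}(A_1 \in da) \\
&= \int_0^\infty \mathbb{P}(Y^* > au)(1 + au)^{\alpha}\, \frac{u^{\alpha}}{(1 + au)^{\alpha}}\, \mathbb{P}(A_1 \in da) \\
&\to \int_0^\infty C_2\, \frac{1}{a^{\alpha}}\, \mathbb{P}(A_1 \in da) = C_2
\end{aligned}$$

by dominated convergence.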


Remark 5.11 The reason that Corollary 5.9 gives a heavier tail asymptotics
than in Section 2 is not so much that the interest is random, but that it is
inherent in the set-up that negative returns are possible. Namely, if $A_1 \ge 1$
(and is not degenerate at 1), then $\mathbb{E}A_1^{-\alpha}$ is always $< 1$ so the conditions of
Corollary 5.9 cannot hold.


Remark 5.12 Nyrhinen [669] gives conditions under which $C_1(u)$ in Corollary
5.9 is not significant in terms of logarithmic asymptotics.


Notes and references Rather than assuming that $\{A_n\}$, $\{B_n\}$ are independent
sequences each consisting of i.i.d. r.v.'s, much of the literature relaxes this to the case
that the pairs $(A_n, B_n)$ are i.i.d., i.e. some dependence between $A_n$ and $B_n$ is allowed. See,
for example, Nyrhinen [669, 668]. For models where the independence among the $A_i$
themselves or among the $B_i$ is relaxed see e.g. Cai [212], Cai & Dickson [214], Goovaerts
et al. [424], Weng, Zhang & Tan [881], Shen, Lin & Zhang [796] and Collamore [252].

The recursion $R_n = A_n(R_{n-1} - B_n)$ also has some relevance as a model for the
risk reserve in the presence of investments. The results are much as for (5.1), but will
not be given here, see again Nyrhinen [669, 668].

Tail asymptotics for finite horizon ruin probabilities $\psi(u,n)$ in a setting where $n$
goes to infinity with $u$ and either $A_1^{-1}$ or $B_1$ (or both) are heavy-tailed are given in
Tang & Tsitsiashvili [830, 831]. For extensions to other ruin-related quantities see
Yang & Zhang [903].

An important early paper is Paulsen [680]. Paulsen [682] surveys the literature up
to 1998 and Paulsen [684] in the decade after that.


6 Continuous-time ruin problems with stochastic investment


Results for continuous-time models with stochastic investment can be obtained
as a suitable limit of corresponding discrete-time set-ups (see the Notes). However,
continuous-time models often also enable a direct analysis that can have
a quite different flavor from its discrete counterpart. In this section this will be
illustrated on a heuristic level for the case of a risk reserve process of compound
Poisson type (with Poisson intensity $\beta$), where all the reserve is continuously
invested in a financial market of Black-Scholes type, i.e. the risky asset is a
geometric Brownian motion. More precisely, the resulting reserve is given by


R, = utt-S Ui+a | R,-ds+o] R,- dBs, 


where {B+} is standard Brownian motion, ø is the volatility and a is the drift 
of a geometric Brownian motion. In view of the generators for the diffusion 
and the compound Poisson process derived in Examples II.4.1 and II.4.2 one 
observes that the generator of the resulting risk process is given by 


af = au fi(u) + Fe? fu) + fu) +6 S (flu 2) flu) Baa). 


Using Itô’s Lemma, one can now show that if a twice continuously differentiable 
and bounded function f(u) with limy_... f(u) = 0 satisfies “af (u) = 0 for u > 0 
and f(u) = 1 for u < 0, then f(u) must be the ruin probability y(u) of the 
process. Hence y(u) satisfies 


2,,2 
aup (u+ 2 


E tw) -B | Wua) Bde) +5B(u) = 0. (6.1) 
0 


with lim, Y(u) = 0.° In this way the problem of studying the ruin probability 
with stochastic investment has been reduced to the purely analytical problem 
of solving an integro-differential equation. 


⁵ Note that for $\sigma = 0$ the investment is riskless and we get back to the risk model with
constant interest rate (cf. (1.11) with $p(u) = p + au$).

Example 6.1 Assume that the claim size distribution is exponential($\nu$). Then
one can add the derivative of (6.1) to (6.1) multiplied by $\nu$ to get rid of the
convolution term,⁶ which leads to

$$\frac{\sigma^2u^2}{2}\psi'''(u) + \Bigl(au + \sigma^2u + 1 + \frac{\nu\sigma^2u^2}{2}\Bigr)\psi''(u) + (a - \beta + ua\nu + \nu)\,\psi'(u) = 0$$

with additional boundary condition $\psi'(0) = \beta\psi(0) - \beta$ (set $u = 0$ in (6.1)). After
substituting $\psi'(u)$ by another function, this is in fact a second-order ODE with
polynomial coefficients which can be solved analytically in terms of special
functions (of Heun type). Although the resulting formula is explicit, it is quite
lengthy and we do not state it here.


Looking at the drift of the geometric Brownian motion, one can show that for
$\sigma^2 \ge 2a$, $\psi(u) = 1$ holds for all $u \ge 0$ (cf. e.g. Paulsen [681] or Pergamenshchikov
& Zeitouny [691]), so it is enough to restrict to $\sigma^2 < 2a$. In general, it is
impossible to obtain an explicit solution of the above IDE, but one can retrieve
asymptotic results as $u \to \infty$ for a large class of claim size distributions.


Theorem 6.2 Assume that the free reserve in the Cramér-Lundberg model is
invested in a financial asset that is modeled by a geometric Brownian motion
with drift $a > 0$ and volatility $\sigma > 0$ with $2a > \sigma^2$. If the claim size distribution
is exponentially bounded, then

$$\psi(u) \sim C\,u^{1-2a/\sigma^2}, \qquad u \to \infty,$$

for some constant $C > 0$. If the claim size distribution is regularly varying
($\bar B(x) \sim L(x)x^{-\alpha}$, $\alpha > 0$), then

$$\psi(u) \sim L_1(u)\,u^{\max\{1-2a/\sigma^2,\,-\alpha\}}, \qquad u \to \infty, \qquad (6.2)$$

where $L(u)$, $L_1(u)$ are slowly varying functions.


⁶ Cf. Section XII.3c for a more general procedure to eliminate convolution terms for a large
class of claim size distributions.

Proof. For the full proof, we refer to Paulsen [683]. Here we only give a sketch
of an analytical proof for $1 < 2a/\sigma^2 < 2$ to highlight the origin of the involved
power terms. In view of the convolution term in (6.1), it is natural to take the
Laplace transform of (6.1) and then try to apply Tauberian theorems to use
the asymptotic behavior of $\hat\psi[-s]$ for $s \to 0$ to infer information about $\psi(u)$ as
$u \to \infty$. For simplicity of notation, call $g(s) = \hat\psi[-s]$ the Laplace transform of
the ruin probability. Since the Laplace transform of $u\psi'(u)$ is $-(sg(s))'$ and the
Laplace transform of $u^2\psi''(u)$ is $(s^2g(s))''$, one obtains from (6.1) after some
elementary calculations that


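$$s^2g''(s) + p_0\,s\,g'(s) + \bigl(q_0 + q_1(s)\bigr)\,g(s) = h(s)$$

with

$$p_0 = 4 - \frac{2a}{\sigma^2}, \qquad q_0 = 2 - \frac{2a}{\sigma^2}, \qquad q_1(s) = 2\bigl(s - \beta + \beta\hat B[-s]\bigr)/\sigma^2,$$

$$h(s) = 2\psi(0)/\sigma^2 - 2\beta\mu_B\hat B_0[-s]/\sigma^2,$$

where $\hat B_0[-s] = (1 - \hat B[-s])/(\mu_B s)$ is the Laplace transform of the integrated
tail distribution. It follows that $s = 0$ is a regular singular point of the
homogeneous equation

$$s^2g''(s) + p_0\,s\,g'(s) + \bigl(q_0 + q_1(s)\bigr)\,g(s) = 0 \qquad (6.3)$$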


which by the usual Frobenius method has a solution of the form

$$g(s) = s^r\sum_{k=0}^{\infty} c_k s^k. \qquad (6.4)$$

Substituting this into (6.3) gives the condition $r(r-1) + p_0r + q_0 = 0$ for $r$, i.e.

$$r_1 = -1 \quad\text{and}\quad r_2 = -2 + \frac{2a}{\sigma^2}. \qquad (6.5)$$
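The indicial equation can be checked numerically. The following is a minimal sketch, assuming the coefficients $p_0 = 4 - 2a/\sigma^2$ and $q_0 = 2 - 2a/\sigma^2$ from above; the function name `indicial_roots` and the parameter values are hypothetical and chosen only so that $1 < 2a/\sigma^2 < 2$.

```python
import math

def indicial_roots(a, sigma):
    """Roots of r(r-1) + p0*r + q0 = 0 with p0 = 4 - 2a/sigma^2, q0 = 2 - 2a/sigma^2."""
    kappa = 2.0 * a / sigma ** 2      # the ratio 2a/sigma^2
    p0 = 4.0 - kappa
    q0 = 2.0 - kappa
    b = p0 - 1.0                      # quadratic in the form r^2 + b*r + q0 = 0
    disc = b * b - 4.0 * q0           # algebraically equal to (kappa - 1)^2
    root = math.sqrt(max(disc, 0.0))
    return (-b - root) / 2.0, (-b + root) / 2.0

# Illustrative (hypothetical) market parameters: 2a/sigma^2 = 1.5.
r1, r2 = indicial_roots(a=0.03, sigma=0.2)
print("r1 =", r1, " r2 =", r2)
```

For $1 < 2a/\sigma^2 < 2$ the smaller root is $r_1 = -1$ and the larger one is $r_2 = -2 + 2a/\sigma^2$, which lies in $(-1, 0)$.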


For $1 < 2a/\sigma^2 < 2$, $r_1 - r_2$ is not an integer, so we obtain two independent
solutions of the homogeneous ODE. The particular solution $g_p(s)$ can now be
obtained by the classical method of variation of constants. With considerable
but purely analytical effort (the details are omitted here) one can show that for
exponentially bounded claim size distribution $B$, $g_p(s)$ tends to a constant for
$s \to 0$. We now look at the asymptotics for $s \to 0$ of the full solution

$$g(s) = C_1s^{-1}\eta_1(s) + C_2s^{-2+2a/\sigma^2}\eta_2(s) + g_p(s)$$

(due to (6.4), $\eta_1(s)$, $\eta_2(s)$ also tend to a constant for $s \to 0$). A priori the
first term would dominate, but from Theorem A6.1 this would translate into
$\int_0^u \psi(x)\,dx \sim C_1u$ and by the Monotone Density Theorem $\psi(u) \to C_1$, which
contradicts $\lim_{u\to\infty}\psi(u) = 0$. Hence we must have $C_1 = 0$ and the second
term dominates the asymptotic behavior at $s \to 0$. Theorem A6.1 now gives
$\int_0^u \psi(x)\,dx \sim C_2u^{2-2a/\sigma^2}/\Gamma(3 - 2a/\sigma^2)$ and the Monotone Density Theorem
implies $\psi(u) \sim C \cdot u^{1-2a/\sigma^2}$ (one still needs to ensure that $C_2 \ne 0$, which is
omitted here).

If, on the other hand, $B$ is regularly varying with parameter $\alpha$, then one can
show that $g_p(s) \sim s^{\alpha-1}L(1/s)$, so that now the dominating term (which is either
the particular solution or the second term above) is of order $s^{\min\{\alpha-1,\,-2+2a/\sigma^2\}}$.
The same arguments as above then imply the claimed result.


Remark 6.3 Theorem 6.2 states that (as in the discrete risk model of the previ- 
ous section) full investment in geometric Brownian motion leads to Pareto-type 
asymptotic decay of the ruin probability even for light-tailed claim distribu- 
tions. If the tail of the claim size distribution is heavy enough (and from (6.2) 
one sees that the tail needs to be very heavy), insurance risk can still dominate 
the financial risk. 
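To make the comparison in Remark 6.3 concrete, the power of $u$ in Theorem 6.2 can be tabulated for given parameters. This is only a sketch: the helper `ruin_decay_exponent` and the numerical values are hypothetical, slowly varying factors are ignored, and the case $2a \le \sigma^2$ is rejected because Theorem 6.2 does not cover it.

```python
def ruin_decay_exponent(a, sigma, alpha=None):
    """Power of u in the asymptotics of Theorem 6.2 (slowly varying factors ignored).

    alpha=None stands for exponentially bounded claims; otherwise alpha is the
    index of regular variation of the claim size tail.
    """
    kappa = 2.0 * a / sigma ** 2
    if kappa <= 1.0:
        raise ValueError("Theorem 6.2 requires 2a > sigma^2")
    financial = 1.0 - kappa           # exponent caused by the investment risk
    if alpha is None:
        return financial
    return max(financial, -alpha)     # the heavier of the two tails dominates

# Hypothetical parameters: 2a/sigma^2 = 4, so the financial exponent is -3.
print(ruin_decay_exponent(0.08, 0.2))             # light-tailed claims
print(ruin_decay_exponent(0.08, 0.2, alpha=1.5))  # claims heavy enough to dominate
print(ruin_decay_exponent(0.08, 0.2, alpha=5.0))  # financial risk dominates
```

In this reading, insurance risk dominates only when $\alpha < 2a/\sigma^2 - 1$, which is the sense in which the claim tail "needs to be very heavy".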


Remark 6.4 If only a constant fraction $\pi$ ($0 < \pi < 1$) of the current wealth is
invested in the risky asset, the analysis above is exactly the same; one just needs
to replace $a$ by $a\pi$ and $\sigma$ by $\sigma\pi$. In Chapter XIV we will see how to improve
the asymptotic behavior of the ruin probability by dynamically changing the
investment fraction $\pi$ as a function of the current risk reserve level.


Notes and references Early results on ruin probabilities with investments can be
found in Frolova, Kabanov & Pergamenshchikov [374] for a Cramér-Lundberg model
with exponential claim sizes and investments into geometric Brownian motion (the
explicit solution of Example 6.1 can also be found there) and bounds are derived
in Kalashnikov & Norberg [520] in a more general set-up. For extensions to more
general claim size distributions, see Constantinescu & Thomann [255]. Results for
the continuous-time risk model can also be derived by using methods motivated by
discrete-time models, see e.g. Nyrhinen [669] and for a fairly general account Paulsen
[683]. Martingale techniques are exploited in Ma & Sun [618]. In order to assess
$\psi(u)$ for moderate size $u$, Paulsen, Kasozi & Steigen [688] transform the IDE (6.1)
into an ordinary Volterra integral equation of the second kind and design an effective
numerical solution procedure for the latter.

The proof technique of Theorem 6.2 outlined here is made rigorous in Albrecher,
Constantinescu & Thomann [22], and it is shown there that the method in principle 
also extends to renewal risk models with interclaim times of rational Laplace transform 
(in particular it turns out that the asymptotic result is insensitive to the choice of 
interclaim time distribution within that class). See also Wei [879] for a different method 
in the renewal set-up. Pergamenshchikov & Zeitouny [691] deal with more general 
premium rate functions and Cai & Xu [220] add perturbation to the original risk 
process. For absolute ruin probabilities in this context, see Gerber & Yang [414]. 

Klüppelberg & Kostadinova [540], Brokate et al. [205], Tang, Wang & Yuen [833]
and Heyde & Wang [462] study the investment into exponential Lévy models in more
detail. For further references in this active field of research we refer again to the recent
survey by Paulsen [684].


Chapter IX 


Matrix-analytic methods 


1 Definition and basic properties of phase-type 
distributions 


Phase-type distributions are the computational vehicle of much of modern ap- 
plied probability. Typically, if a problem can be solved explicitly when the rele- 
vant distributions are exponentials, then the problem may admit an algorithmic 
solution involving a reasonable degree of computational effort if one allows for 
the more general assumption of phase-type structure, and not in other cases. A 
proper knowledge of phase-type distributions seems therefore a must for anyone 
working in an applied probability area like risk theory. 

A distribution $B$ on $(0,\infty)$ is said to be of phase-type if $B$ is the distribution
of the lifetime of a terminating Markov process $\{J_t\}_{t\ge0}$ with finitely many
states and time homogeneous transition rates. More precisely, a terminating
Markov process $\{J_t\}$ with state space $E$ and intensity matrix $T$ is defined as
the restriction to $E$ of a Markov process $\{\bar J_t\}_{0\le t<\infty}$ on $E_\Delta = E \cup \{\Delta\}$ where
$\Delta$ is some extra state which is absorbing, that is, $P_i(\bar J_t = \Delta \text{ eventually}) = 1$ for
all $i \in E$,¹ and where all states $i \in E$ are transient. This implies in particular
that the intensity matrix for $\{\bar J_t\}$ can be written in block-partitioned form as

$$\begin{pmatrix} T & \mathbf{t} \\ 0 & 0 \end{pmatrix}. \qquad (1.1)$$

We often write $p$ for the number of elements of $E$. Note that since (1.1) is the
intensity matrix of a non-terminating Markov process, the rows sum to zero,
which in matrix notation can be rewritten as $\mathbf{t} + Te = 0$ where $e$ is the column
$E$-vector with all components equal to one. In particular, $T$ is a subintensity
matrix,² and we have

$$\mathbf{t} = -Te. \qquad (1.2)$$

¹ Here as usual, $P_i$ refers to the case $J_0 = i$; if $\nu = (\nu_i)_{i\in E}$ is a probability distribution, we
write $P_\nu$ for the case where $J_0$ has distribution $\nu$ so that $P_\nu = \sum_{i\in E}\nu_iP_i$.


The interpretation of the column vector $\mathbf{t}$ is as the exit rate vector, i.e. the
$i$th component $t_i$ gives the intensity in state $i$ for leaving $E$ and going to the
absorbing state $\Delta$.

We now say that $B$ is of phase-type with representation $(E, \alpha, T)$ (or sometimes
just $(\alpha, T)$) if $B$ is the $P_\alpha$-distribution of $\zeta = \inf\{t > 0: \bar J_t = \Delta\}$
(the absorption time), i.e. $B(t) = P_\alpha(\zeta \le t)$. Equivalently, $\zeta$ is the lifetime
$\sup\{t \ge 0: J_t \in E\}$ of $\{J_t\}$. A convenient graphical representation is the phase
diagram in terms of the entrance probabilities $\alpha_i$, the exit rates $t_i$ and the
transition rates (intensities) $t_{ij}$:

[Phase diagram: states $i$, $j$, $k$ with entrance probabilities $\alpha_i$, $\alpha_j$, $\alpha_k$, exit rates $t_i$, $t_j$, $t_k$ and transition rates $t_{ij}$, $t_{ji}$, $t_{ik}$, $t_{ki}$, $t_{jk}$, $t_{kj}$.]

FIGURE IX.1: The phase diagram of a phase-type distribution with 3 phases,
$E = \{i,j,k\}$.


The initial vector $\alpha$ is written as a row vector.


Here are some important special cases: 


Example 1.1 Suppose that $p = 1$ and write $\beta = -t_{11}$. Then $\alpha = \alpha_1 = 1$,
$t_1 = \beta$, and the phase-type distribution is the lifetime of a particle with constant
failure rate $\beta$, that is, an exponential distribution with rate parameter $\beta$.
Thus the phase-type distributions with $p = 1$ are exactly the class of exponential
distributions.


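² This means that $t_{ii} \le 0$, $t_{ij} \ge 0$ for $i \ne j$ and $\sum_{j\in E} t_{ij} \le 0$.

Example 1.2 The Erlang distribution $E_p$ with $p$ phases is defined as the Gamma
distribution with integer parameter $p$ and density

$$\delta^p\frac{x^{p-1}}{(p-1)!}e^{-\delta x}. \qquad (1.3)$$

Since this corresponds to a convolution of $p$ exponential densities with the same
rate $\delta$, the $E_p$ distribution may be represented by the phase diagram ($p = 3$)

[Phase diagram: entrance probability $\alpha_1 = 1$; phases $1 \to 2 \to 3 \to \Delta$, each transition at rate $\delta$.]

FIGURE IX.2

corresponding to $E = \{1,\ldots,p\}$, $\alpha = (1\ 0\ 0\ \ldots\ 0\ 0)$,

$$T = \begin{pmatrix}
-\delta & \delta & 0 & \cdots & 0 & 0 \\
0 & -\delta & \delta & \cdots & 0 & 0 \\
\vdots & & & \ddots & & \vdots \\
0 & 0 & 0 & \cdots & -\delta & \delta \\
0 & 0 & 0 & \cdots & 0 & -\delta
\end{pmatrix}, \qquad
\mathbf{t} = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ \delta \end{pmatrix}.$$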
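As an illustration of the Erlang representation, the sketch below builds $(\alpha, T, \mathbf{t})$ for a given $p$ and rate and evaluates moments through the standard formula $(-1)^n n!\,\alpha T^{-n}e$ for phase-type distributions. The function names `erlang_representation`, `solve` and `moment` are hypothetical, and exact rational arithmetic is used only to make the checks transparent.

```python
from fractions import Fraction

def erlang_representation(p, delta):
    """Return (alpha, T, t) for the Erlang distribution with p phases and rate delta."""
    delta = Fraction(delta)
    alpha = [Fraction(1)] + [Fraction(0)] * (p - 1)
    T = [[Fraction(0)] * p for _ in range(p)]
    for i in range(p):
        T[i][i] = -delta
        if i + 1 < p:
            T[i][i + 1] = delta
    t = [-sum(row) for row in T]          # exit rates, t = -T e
    return alpha, T, t

def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination."""
    n = len(A)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        piv = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[piv] = M[piv], M[c]
        M[c] = [v / M[c][c] for v in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [M[i][n] for i in range(n)]

def moment(alpha, T, n):
    """n-th moment (-1)^n n! alpha T^{-n} e."""
    v = [Fraction(1)] * len(T)
    fact = 1
    for k in range(1, n + 1):
        v = solve(T, v)                   # v <- T^{-1} v
        fact *= k
    return (-1) ** n * fact * sum(a * x for a, x in zip(alpha, v))

alpha, T, t = erlang_representation(3, 2)
print(t, moment(alpha, T, 1), moment(alpha, T, 2))
```

For $p = 3$ and $\delta = 2$ the first two moments should agree with the Gamma values $p/\delta$ and $p(p+1)/\delta^2$.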


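Example 1.3 The hyperexponential distribution $H_p$ with $p$ parallel channels is
defined as a mixture of $p$ exponential distributions with rates $\delta_1,\ldots,\delta_p$ so that
the density is

$$\sum_{i=1}^{p}\alpha_i\delta_ie^{-\delta_ix}. \qquad (1.4)$$

Thus $E = \{1,\ldots,p\}$,

$$T = \begin{pmatrix}
-\delta_1 & 0 & \cdots & 0 & 0 \\
0 & -\delta_2 & \cdots & 0 & 0 \\
\vdots & & \ddots & & \vdots \\
0 & 0 & \cdots & -\delta_{p-1} & 0 \\
0 & 0 & \cdots & 0 & -\delta_p
\end{pmatrix}, \qquad
\mathbf{t} = \begin{pmatrix} \delta_1 \\ \delta_2 \\ \vdots \\ \delta_{p-1} \\ \delta_p \end{pmatrix},$$

and the phase diagram is ($p = 2$)

[Phase diagram: two parallel phases $1$ and $2$, entered with probabilities $\alpha_1$, $\alpha_2$ and left for $\Delta$ at rates $\delta_1$, $\delta_2$.]

FIGURE IX.3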


Example 1.4 (COXIAN DISTRIBUTIONS) This class of distributions is popular 
in much of the applied literature, and is defined as the class of phase-type 
distributions with a phase diagram of the following form: 


[Phase diagram: phases $1, 2, \ldots, p-1, p$ arranged in series, with exit rates $t_1, \ldots, t_p$.]

FIGURE IX.4


For example, the Erlang distribution is a special case of a Coxian distribution. 


The basic analytical properties of phase-type distributions are given by the
following result. Recall that the matrix-exponential $e^K$ is defined by the standard
series expansion $\sum_{n=0}^{\infty} K^n/n!$.³


Theorem 1.5 Let $B$ be phase-type with representation $(E, \alpha, T)$. Then:
(a) the c.d.f. is $B(x) = 1 - \alpha e^{Tx}e$;

(b) the density is $b(x) = B'(x) = \alpha e^{Tx}\mathbf{t}$;

(c) the m.g.f. $\hat B[r] = \int_0^\infty e^{rx}B(dx)$ is $\alpha(-rI - T)^{-1}\mathbf{t}$;

(d) the $n$th moment $\int_0^\infty x^nB(dx)$ is $(-1)^nn!\,\alpha T^{-n}e$.


Proof. Let $\bar P^s = (\bar p^s_{ij})$ be the $s$-step $E_\Delta \times E_\Delta$ transition matrix for $\{\bar J_t\}$ and
$P^s$ the $s$-step $E \times E$ transition matrix for $\{J_t\}$, i.e. the restriction of $\bar P^s$ to $E$.
Then for $i,j \in E$, the backwards equation for $\{\bar J_t\}$ (e.g. [APQ, p. 48]) yields

$$\frac{dp^s_{ij}}{ds} = \frac{d\bar p^s_{ij}}{ds} = t_i\bar p^s_{\Delta j} + \sum_{k\in E} t_{ik}\bar p^s_{kj} = \sum_{k\in E} t_{ik}p^s_{kj}.$$

That is, $\frac{d}{ds}P^s = TP^s$, and since obviously $P^0 = I$, the solution is $P^s = e^{Ts}$.
Since

$$1 - B(x) = P_\alpha(\zeta > x) = P_\alpha(J_x \in E) = \sum_{i,j\in E}\alpha_ip^x_{ij} = \alpha P^xe,$$

this proves (a), and (b) then follows from

$$B'(x) = -\alpha\frac{d}{dx}P^xe = -\alpha e^{Tx}Te = \alpha e^{Tx}\mathbf{t}$$

(since $T$ and $e^{Tx}$ commute). For (c), the rule (A.12) for integrating
matrix-exponentials yields

$$\hat B[r] = \int_0^\infty e^{rx}\alpha e^{Tx}\mathbf{t}\,dx = \alpha\Bigl(\int_0^\infty e^{(rI+T)x}\,dx\Bigr)\mathbf{t} = \alpha(-rI - T)^{-1}\mathbf{t}.$$

³ For a number of additional important properties of matrix-exponentials and discussion of
computational aspects, see Appendix A3.


Alternatively, define $h_i = E_ie^{r\zeta}$. Then

$$h_i = \frac{-t_{ii}}{-t_{ii} - r}\Bigl(\frac{t_i}{-t_{ii}} + \sum_{j\ne i}\frac{t_{ij}}{-t_{ii}}h_j\Bigr). \qquad (1.5)$$

Indeed, $-t_{ii}$ is the rate of the exponential holding time of state $i$ and hence
$(-t_{ii})/(-t_{ii} - r)$ is the m.g.f. of the initial sojourn in state $i$. After that, we
either go to state $j \ne i$ w.p. $t_{ij}/(-t_{ii})$ and have an additional time to absorption
which has m.g.f. $h_j$, or w.p. $t_i/(-t_{ii})$ we go to $\Delta$, in which case the time to
absorption is 0 with m.g.f. 1. Rewriting (1.5) as

$$h_i(t_{ii} + r) = -t_i - \sum_{j\ne i}t_{ij}h_j, \qquad \sum_{j\in E}t_{ij}h_j + h_ir = -t_i,$$

this means in vector notation that $(T + rI)h = -\mathbf{t}$, i.e. $h = -(T + rI)^{-1}\mathbf{t}$, and
since $\hat B[r] = \alpha h$, we arrive once more at the stated expression for $\hat B[r]$.
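Part (d) follows by differentiating the m.g.f.,

$$\frac{d^n}{dr^n}\,\alpha(-rI - T)^{-1}\mathbf{t} = (-1)^{n+1}n!\,\alpha(rI + T)^{-n-1}\mathbf{t},$$

$$\hat B^{(n)}[0] = (-1)^{n+1}n!\,\alpha T^{-n-1}\mathbf{t} = (-1)^nn!\,\alpha T^{-n-1}Te = (-1)^nn!\,\alpha T^{-n}e.$$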


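Alternatively, for $n = 1$ we may put $k_i = E_i\zeta$ and get as in (1.5)

$$k_i = \frac{1}{-t_{ii}} + \sum_{j\ne i}\frac{t_{ij}}{-t_{ii}}k_j,$$

which is solved as above to get $k = -T^{-1}e$, so that the mean is $\alpha k = -\alpha T^{-1}e$.

Example 1.6 Though typically the evaluation of matrix-exponentials is most
conveniently carried out on a computer, there are some examples where it is
appealing to write $T$ in diagonal form, making the problem trivial. One obvious
instance is the hyperexponential distribution, another the case $p = 2$ where
explicit diagonalization formulas are always available, see the Appendix. Consider
for example

$$\alpha = (1/2\ \ 1/2), \qquad T = \begin{pmatrix} -3/2 & 9/14 \\ 7/2 & -11/2 \end{pmatrix} \qquad\text{so that}\qquad \mathbf{t} = \begin{pmatrix} 6/7 \\ 2 \end{pmatrix}.$$

Then (cf. Example A3.7) the diagonal form of $T$ is

$$T = -\begin{pmatrix} 9/10 & 9/70 \\ 7/10 & 1/10 \end{pmatrix} - 6\begin{pmatrix} 1/10 & -9/70 \\ -7/10 & 9/10 \end{pmatrix},$$

where the two matrices on the r.h.s. are idempotent. This implies that we can
compute the $n$th moment as

$$(-1)^nn!\,\alpha T^{-n}e = n!\,(1/2\ \ 1/2)\begin{pmatrix} 9/10 & 9/70 \\ 7/10 & 1/10 \end{pmatrix}\begin{pmatrix} 1 \\ 1 \end{pmatrix} + 6^{-n}n!\,(1/2\ \ 1/2)\begin{pmatrix} 1/10 & -9/70 \\ -7/10 & 9/10 \end{pmatrix}\begin{pmatrix} 1 \\ 1 \end{pmatrix} = n!\Bigl(\frac{32}{35} + \frac{3}{35\cdot6^n}\Bigr).$$

Similarly, we get the density as

$$\alpha e^{Tx}\mathbf{t} = e^{-x}(1/2\ \ 1/2)\begin{pmatrix} 9/10 & 9/70 \\ 7/10 & 1/10 \end{pmatrix}\begin{pmatrix} 6/7 \\ 2 \end{pmatrix} + e^{-6x}(1/2\ \ 1/2)\begin{pmatrix} 1/10 & -9/70 \\ -7/10 & 9/10 \end{pmatrix}\begin{pmatrix} 6/7 \\ 2 \end{pmatrix} = \frac{32}{35}e^{-x} + \frac{18}{35}e^{-6x}.$$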
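The numbers in Example 1.6 can be reproduced with exact rational arithmetic. The sketch below takes $\alpha$, $T$ and the two idempotent matrices from the example (called `E1` and `E2` here, names that are not in the text) and checks the decomposition, the moments and the density weights; the helper names are hypothetical.

```python
from fractions import Fraction as F
from math import exp, factorial

alpha = [F(1, 2), F(1, 2)]
T = [[F(-3, 2), F(9, 14)], [F(7, 2), F(-11, 2)]]
E1 = [[F(9, 10), F(9, 70)], [F(7, 10), F(1, 10)]]     # idempotent for eigenvalue -1
E2 = [[F(1, 10), F(-9, 70)], [F(-7, 10), F(9, 10)]]   # idempotent for eigenvalue -6

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def bilinear(row, M, col):
    """Evaluate row * M * col for a 2x2 matrix M."""
    return sum(row[i] * M[i][j] * col[j] for i in range(2) for j in range(2))

t = [-(T[i][0] + T[i][1]) for i in range(2)]          # exit rates, t = -T e
e = [F(1), F(1)]

def decomposition_holds():
    rebuilt = [[-E1[i][j] - 6 * E2[i][j] for j in range(2)] for i in range(2)]
    return rebuilt == T and matmul(E1, E1) == E1 and matmul(E2, E2) == E2

def moment(n):
    """n-th moment n! (alpha E1 e + 6^{-n} alpha E2 e)."""
    return factorial(n) * (bilinear(alpha, E1, e) + bilinear(alpha, E2, e) / F(6) ** n)

def density(x):
    """Density alpha exp(Tx) t via the diagonal form."""
    c1, c2 = bilinear(alpha, E1, t), bilinear(alpha, E2, t)
    return float(c1) * exp(-x) + float(c2) * exp(-6 * x)

print(decomposition_holds(), moment(1), density(0.0))
```

The mean obtained this way can be compared with $-\alpha T^{-1}e$, and the density at $0$ with $\alpha\mathbf{t}$.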


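The following result becomes basic in Sections 4, 5 and serves at this stage
to introduce Kronecker notation and calculus (see Section 4b for definitions and
basic rules):

Proposition 1.7 If $B$ is phase-type with representation $(\nu, T)$, then the matrix
m.g.f. $\hat B[Q]$ of $B$ is

$$\hat B[Q] = \int_0^\infty e^{Qx}B(dx) = (\nu \otimes I)\bigl(-(T \oplus Q)\bigr)^{-1}(\mathbf{t} \otimes I). \qquad (1.6)$$

Proof. According to (A.29) and Proposition A4.4,

$$\hat B[Q] = \int_0^\infty \nu e^{Tx}\mathbf{t}\,e^{Qx}\,dx = (\nu \otimes I)\Bigl(\int_0^\infty e^{Tx} \otimes e^{Qx}\,dx\Bigr)(\mathbf{t} \otimes I)$$

$$= (\nu \otimes I)\Bigl(\int_0^\infty e^{(T\oplus Q)x}\,dx\Bigr)(\mathbf{t} \otimes I) = (\nu \otimes I)\bigl(-(T \oplus Q)\bigr)^{-1}(\mathbf{t} \otimes I).$$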


Sometimes it is relevant also to consider phase-type distributions, where the
initial vector $\alpha$ is substochastic, $\|\alpha\| = \sum_{i\in E}\alpha_i < 1$. There are two ways to
interpret this:


• The phase-type distribution $B$ is defective, i.e. $\|B\| = \|\alpha\| < 1$; a random
variable $U$ having a defective phase-type distribution with representation
$(\alpha, T)$ is then defined to be $\infty$ on a set of probability $1 - \|\alpha\|$, or one just
lets $U$ be undefined on this additional set.


• The phase-type distribution $B$ is zero-modified, i.e. a mixture of a phase-type
distribution with representation $(\alpha/\|\alpha\|, T)$ with weight $\|\alpha\|$ and
an atom at zero with weight $1 - \|\alpha\|$. This is the traditional choice in
the literature, and in fact one also most often there allows $\alpha$ to have a
component $\alpha_\Delta$ at $\Delta$.


1a Asymptotic exponentiality


Writing $T$ in the Jordan canonical form, it is easily seen that the asymptotic
form of the tail of a general phase-type distribution has the form

$$\bar B(x) \sim Cx^ke^{-\eta x},$$

where $C, \eta > 0$ and $k = 0, 1, 2, \ldots$ The Erlang distribution gives an example
where $k > 0$ (in fact, then $k = p - 1$), but in many practical cases, one has
$k = 0$. Here is a sufficient condition:


Proposition 1.8 Let $B$ be phase-type with representation $(\alpha, T)$, assume that
$T$ is irreducible, let $-\eta$ be the eigenvalue of largest real part of $T$, let $\nu$, $h$ be the
corresponding left and right eigenvectors normalized by $\nu h = 1$ and define
$C = \alpha h \cdot \nu e$. Then the tail $\bar B(x)$ is asymptotically exponential,

$$\bar B(x) \sim Ce^{-\eta x}. \qquad (1.7)$$




Proof. By Perron-Frobenius theory (A.4c), $\eta$ is real and positive, $\nu$, $h$ can be
chosen with strictly positive components, and we have

$$e^{Tx} \sim h\nu\,e^{-\eta x}, \qquad x \to \infty.$$

Using $\bar B(x) = \alpha e^{Tx}e$, the result follows (with $C = (\alpha h)(\nu e)$).
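For a two-phase representation the constants in Proposition 1.8 can be written down directly. The following sketch assumes an irreducible $2 \times 2$ subintensity matrix (so both off-diagonal entries are positive and the eigenvalues are real); the function `tail_constants` is a hypothetical helper, and the matrix used is the one from Example 1.6.

```python
import math

def tail_constants(alpha, T):
    """Return (eta, C) such that the tail behaves like C * exp(-eta * x)."""
    (t11, t12), (t21, t22) = T
    trace = t11 + t22
    det = t11 * t22 - t12 * t21
    # Eigenvalue of largest real part of a 2x2 matrix with real spectrum.
    lam = (trace + math.sqrt(trace * trace - 4.0 * det)) / 2.0
    h = (t12, lam - t11)      # right eigenvector: T h = lam h
    nu = (t21, lam - t11)     # left eigenvector:  nu T = lam nu
    scale = nu[0] * h[0] + nu[1] * h[1]            # enforce nu h = 1
    alpha_h = alpha[0] * h[0] + alpha[1] * h[1]
    nu_e = nu[0] + nu[1]
    return -lam, alpha_h * nu_e / scale

# Representation taken from Example 1.6.
eta, C = tail_constants([0.5, 0.5], [[-1.5, 9.0 / 14.0], [3.5, -5.5]])
print(eta, C)
```

For this matrix the dominant eigenvalue is $-1$, and the constant should agree with the coefficient of $e^{-x}$ in the exact tail $\alpha e^{Tx}e$ obtained from the diagonal form in Example 1.6.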


Of course, the conditions of Proposition 1.8 are far from necessary (a mixture
of phase-type distributions with the respective $T^{(i)}$ irreducible has obviously
an asymptotically exponential tail, but the relevant $T$ is not irreducible, cf.
Example A5.8).

In Proposition A5.1 of the Appendix, we give a criterion for asymptotical
exponentiality of a phase-type distribution $B$, not only in the tail but in the
whole distribution.


Notes and references The idea behind using phase-type distributions goes back 
to Erlang, but today’s interest in the topic was largely initiated by M.F. Neuts, see his 
book [660] (a historically important intermediate step is Jensen [505]). Other expositions
of the basic theory of phase-type distributions can be found in [APQ], Lipsky [600], 
Rolski, Schmidli, Schmidt & Teugels [746] and Wolff [894]. All material of the present 
section is standard; the text is essentially identical to Section 2 of Asmussen [68]. 

In some of the literature and also in Section XII.3, the slightly larger class of 
distributions with a rational m.g.f. (or Laplace transform) is used which may seem 
less intuitive than phase-type distributions. See in particular the notes to Section 6. 
O’Cinneide [670] gave a necessary and sufficient criterion for a distribution B with a 
rational m.g.f. B[s] = p(s)/q(s) to be phase-type: the density b(x) should be strictly 
positive for x > 0 and the root of q(s) with the smallest real part should be unique (not 
necessarily simple, cf. the Erlang case). No satisfying algorithm for finding a phase 
representation of a distribution B (which is known to be phase-type and for which the 
m.g.f. or the density is available) is, however, known. A related important unsolved 
problem deals with minimal representations: given a phase-type distribution, what is 
the smallest possible dimension of the phase space E? 


2 Renewal theory 


A summary of the renewal theory in general is given in A1 of the Appendix, but
is in part repeated below. Let $U_1, U_2, \ldots$ be i.i.d. with common distribution $B$
and define⁴

$$U(A) = E\,\#\{n = 0, 1, \ldots : U_1 + \cdots + U_n \in A\} = E\sum_{n=0}^{\infty} I(U_1 + \cdots + U_n \in A).$$

⁴ Here the empty sum $U_1 + \cdots + U_0$ is 0.

We may think of the $U_i$ as the lifetimes of items (say electrical bulbs) which are
replaced upon failure, and $U(A)$ is then the expected number of replacements
(renewals) in $A$. For this reason, we refer to $U$ as the renewal measure; if $U$ is
absolutely continuous on $(0, \infty)$ w.r.t. Lebesgue measure, we denote the density
by $u(x)$ and refer to $u$ as the renewal density. If $B$ is exponential with rate
$\beta$, the renewals form a Poisson process and we have $u(x) = \beta$. The explicit
calculation of the renewal density (or the renewal measure) is often thought
of as infeasible for other distributions, but nevertheless, the problem has an
algorithmically tractable solution if $B$ is phase-type:


Theorem 2.1 Consider a renewal process with interarrivals which are phase-type
with representation $(\alpha, T)$. Then the renewal density exists and is given
by

$$u(x) = \alpha e^{(T+\mathbf{t}\alpha)x}\mathbf{t}. \qquad (2.1)$$


Proof. Let $\{J_t^{(k)}\}$ be the governing phase process for $U_k$ and define $\{J_t\}$ by
piecing the $\{J_t^{(k)}\}$ together,

$$J_t = J_t^{(1)},\ 0 \le t < U_1, \qquad J_t = J^{(2)}_{t-U_1},\ U_1 \le t < U_1 + U_2, \quad \ldots$$

Then $\{J_t\}$ is Markov and has two types of jumps, the jumps of the $J^{(k)}$ and the
jumps corresponding to a transition from one $J^{(k)}$ to the next $J^{(k+1)}$. A jump
of the last type from $i$ to $j$ occurs at rate $t_i\alpha_j$, and the jumps of the first type
are governed by $T$. Hence the intensity matrix is $T + \mathbf{t}\alpha$, and the distribution
of $J_x$ is $\alpha e^{(T+\mathbf{t}\alpha)x}$. The renewal density at $x$ is now just the rate of jumps of
the second type, which is $t_i$ in state $i$. Hence (2.1) follows by the law of total
probability.


The argument goes through without change if the renewal process is terminating,
i.e. $B$ is defective, and hence (2.1) remains valid for that case. However,
the phase-type assumptions also yield the distribution of a further quantity of
fundamental importance in later parts of this chapter, the lifetime of the renewal
process. This is defined as $U_1 + \cdots + U_{\kappa-1}$ where $\kappa$ is the first $k$ with $U_k = \infty$,
that is, as the time of the last renewal; since $U_k = \infty$ with probability $1 - \|B\|$
which is $> 0$ in the defective case, this is well-defined.


Corollary 2.2 Consider a terminating renewal process with interarrivals which
are defective phase-type with representation $(\alpha, T)$, i.e. $\|\alpha\| < 1$. Then the
lifetime is zero-modified phase-type with representation $(\alpha, T + \mathbf{t}\alpha)$.


Proof. Just note that $\{J_t\}$ is a governing phase process for the lifetime.


Returning to non-terminating renewal processes, define the excess life $\xi(t)$
at time $t$ as the time until the next renewal following $t$, see Fig. IX.5.

[Figure: sample path of the excess life $\xi(t)$ between the renewal epochs $U_1$, $U_1 + U_2$, $\ldots$]

FIGURE IX.5


Corollary 2.3 Consider a renewal process with interarrivals which are phase-type
with representation $(\alpha, T)$, and let $\mu_B = -\alpha T^{-1}e$ be the mean of $B$. Then:
(a) the excess life $\xi(t)$ at time $t$ is phase-type with representation $(\nu_t, T)$ where
$\nu_t = \alpha e^{(T+\mathbf{t}\alpha)t}$;

(b) $\xi(t)$ has a limiting distribution as $t \to \infty$, which is phase-type with
representation $(\nu, T)$ where $\nu = -\alpha T^{-1}/\mu_B$. Equivalently, the density is
$\nu e^{Tx}\mathbf{t} = \bar B(x)/\mu_B$.


Proof. Consider again the process $\{J_t\}$ in the proof of Theorem 2.1. The time
of the next renewal after $t$ is the time of the next jump of the second type, hence
$\xi(t)$ is phase-type with representation $(\nu_t, T)$ where $\nu_t$ is the distribution of $J_t$
which is obviously given by the expression in (a). Hence in (b) it is immediate
that $\nu$ exists and is the stationary limiting distribution of $J_t$, i.e. the unique
positive solution of

$$\nu e = 1, \qquad \nu(T + \mathbf{t}\alpha) = 0. \qquad (2.2)$$

Here are two different arguments that this yields the asserted expression:

(i) Just check that $-\alpha T^{-1}/\mu_B$ satisfies (2.2):

$$\frac{-\alpha T^{-1}}{\mu_B}e = \frac{\mu_B}{\mu_B} = 1,$$

$$\frac{-\alpha T^{-1}}{\mu_B}(T + \mathbf{t}\alpha) = \frac{-\alpha + \alpha T^{-1}Te\alpha}{\mu_B} = \frac{-\alpha + \alpha e\alpha}{\mu_B} = \frac{-\alpha + \alpha}{\mu_B} = 0.$$

(ii) First check the asserted identity for the density: since $T$, $T^{-1}$ and $e^{Tx}$
commute, we get

$$\frac{\bar B(x)}{\mu_B} = \frac{\alpha e^{Tx}e}{\mu_B} = \frac{\alpha T^{-1}e^{Tx}Te}{\mu_B} = \frac{-\alpha T^{-1}}{\mu_B}e^{Tx}\mathbf{t} = \nu e^{Tx}\mathbf{t}.$$

Next appeal to the standard fact from renewal theory that the limiting
distribution of $\xi(t)$ has density $\bar B(x)/\mu_B$, cf. Section A.1e.
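Argument (i) is easy to replay numerically. The sketch below, restricted to two phases for brevity, computes $\nu = -\alpha T^{-1}/\mu_B$ in exact arithmetic and evaluates the residual $\nu(T + \mathbf{t}\alpha)$ from (2.2); the function names are hypothetical and the representation is borrowed from Example 1.6 purely as test data.

```python
from fractions import Fraction as F

def inverse_2x2(M):
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]

def stationary_excess_vector(alpha, T):
    """Return (nu, mu_B) with nu = -alpha T^{-1} / mu_B and mu_B = -alpha T^{-1} e."""
    Tinv = inverse_2x2(T)
    w = [-(alpha[0] * Tinv[0][j] + alpha[1] * Tinv[1][j]) for j in range(2)]
    mu_B = w[0] + w[1]
    return [x / mu_B for x in w], mu_B

def residual(nu, alpha, T):
    """Row vector nu (T + t alpha), which should vanish by (2.2)."""
    t = [-(T[i][0] + T[i][1]) for i in range(2)]      # exit rates, t = -T e
    Q = [[T[i][j] + t[i] * alpha[j] for j in range(2)] for i in range(2)]
    return [nu[0] * Q[0][j] + nu[1] * Q[1][j] for j in range(2)]

alpha = [F(1, 2), F(1, 2)]
T = [[F(-3, 2), F(9, 14)], [F(7, 2), F(-11, 2)]]
nu, mu_B = stationary_excess_vector(alpha, T)
print(nu, mu_B, residual(nu, alpha, T))
```

Both conditions in (2.2) can be read off from the output: the entries of `nu` sum to one and the residual is the zero vector.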


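Example 2.4 Consider a non-terminating renewal process with two phases.
The formulas involve the matrix-exponential of the intensity matrix

$$Q = T + \mathbf{t}\alpha = \begin{pmatrix} t_{11} + t_1\alpha_1 & t_{12} + t_1\alpha_2 \\ t_{21} + t_2\alpha_1 & t_{22} + t_2\alpha_2 \end{pmatrix} = \begin{pmatrix} -q_1 & q_1 \\ q_2 & -q_2 \end{pmatrix} \quad\text{(say)}.$$

According to Example A3.6, we first compute the stationary distribution of $Q$,

$$\pi = (\pi_1\ \ \pi_2) = \Bigl(\frac{q_2}{q_1 + q_2}\ \ \ \frac{q_1}{q_1 + q_2}\Bigr),$$

and the non-zero eigenvalue $\lambda = -q_1 - q_2$. The renewal density is then

$$\begin{aligned}
\alpha e^{Qt}\mathbf{t} &= (\alpha_1\ \ \alpha_2)\begin{pmatrix} \pi_1 & \pi_2 \\ \pi_1 & \pi_2 \end{pmatrix}\begin{pmatrix} t_1 \\ t_2 \end{pmatrix} + e^{\lambda t}(\alpha_1\ \ \alpha_2)\begin{pmatrix} \pi_2 & -\pi_2 \\ -\pi_1 & \pi_1 \end{pmatrix}\begin{pmatrix} t_1 \\ t_2 \end{pmatrix} \\
&= (\pi_1\ \ \pi_2)\begin{pmatrix} t_1 \\ t_2 \end{pmatrix} + e^{\lambda t}(\alpha_1\ \ \alpha_2)\begin{pmatrix} \pi_2(t_1 - t_2) \\ \pi_1(t_2 - t_1) \end{pmatrix} \\
&= \pi_1t_1 + \pi_2t_2 + e^{\lambda t}(\alpha_1\pi_2 - \alpha_2\pi_1)(t_1 - t_2) \\
&= \frac{1}{\mu_B} + e^{\lambda t}(\alpha_1\pi_2 - \alpha_2\pi_1)(t_1 - t_2).
\end{aligned}$$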


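Example 2.5 Let $B$ be Erlang(2). Then

$$Q = \begin{pmatrix} -\delta & \delta \\ 0 & -\delta \end{pmatrix} + \begin{pmatrix} 0 \\ \delta \end{pmatrix}(1\ \ 0) = \begin{pmatrix} -\delta & \delta \\ \delta & -\delta \end{pmatrix}.$$

Hence $\pi = (1/2\ \ 1/2)$, $\lambda = -2\delta$, and Example 2.4 yields the renewal density as

$$u(t) = \frac{\delta}{2}\bigl(1 - e^{-2\delta t}\bigr).$$

Example 2.6 Let $B$ be hyperexponential. Then

$$Q = \begin{pmatrix} -\delta_1 & 0 \\ 0 & -\delta_2 \end{pmatrix} + \begin{pmatrix} \delta_1 \\ \delta_2 \end{pmatrix}(\alpha_1\ \ \alpha_2) = \begin{pmatrix} -\delta_1\alpha_2 & \delta_1\alpha_2 \\ \delta_2\alpha_1 & -\delta_2\alpha_1 \end{pmatrix}.$$

Hence

$$\pi = \Bigl(\frac{\delta_2\alpha_1}{\delta_1\alpha_2 + \delta_2\alpha_1}\ \ \ \frac{\delta_1\alpha_2}{\delta_1\alpha_2 + \delta_2\alpha_1}\Bigr),$$

$\lambda = -\delta_1\alpha_2 - \delta_2\alpha_1$, and Example 2.4 yields the renewal density as

$$u(t) = \frac{\delta_1\delta_2}{\delta_1\alpha_2 + \delta_2\alpha_1} + e^{-(\delta_1\alpha_2+\delta_2\alpha_1)t}\,\frac{(\delta_1 - \delta_2)^2\alpha_1\alpha_2}{\delta_1\alpha_2 + \delta_2\alpha_1}.$$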
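The closed forms in Examples 2.5 and 2.6 can be compared with a direct evaluation of (2.1). The sketch below uses a plain Taylor series with scaling and squaring for the matrix exponential, which is an illustrative choice and not a method taken from the text; all function names and parameter values are hypothetical.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def expm(A, squarings=12, terms=20):
    """Matrix exponential by scaling, truncated Taylor series and repeated squaring."""
    n = len(A)
    scale = 2.0 ** squarings
    small = [[a / scale for a in row] for row in A]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in matmul(term, small)]
        result = [[r + s for r, s in zip(rr, tt)] for rr, tt in zip(result, term)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result

def renewal_density(alpha, T, x):
    """u(x) = alpha exp((T + t alpha) x) t, cf. (2.1)."""
    n = len(T)
    t = [-sum(row) for row in T]                      # exit rates, t = -T e
    Qx = [[(T[i][j] + t[i] * alpha[j]) * x for j in range(n)] for i in range(n)]
    P = expm(Qx)
    row = [sum(alpha[i] * P[i][j] for i in range(n)) for j in range(n)]
    return sum(row[j] * t[j] for j in range(n))

x = 0.7
erlang = renewal_density([1.0, 0.0], [[-2.0, 2.0], [0.0, -2.0]], x)     # delta = 2
hyper = renewal_density([0.5, 0.5], [[-3.0, 0.0], [0.0, -7.0]], x)      # rates 3 and 7
print(erlang, 1.0 - math.exp(-4.0 * x))
print(hyper, 4.2 + 0.8 * math.exp(-5.0 * x))
```

Each printed pair shows the numerical value next to the corresponding closed form, evaluated for $\delta = 2$ and for $\alpha_1 = \alpha_2 = 1/2$, $\delta_1 = 3$, $\delta_2 = 7$.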


Notes and references Early expositions of renewal theory for phase-type dis- 
tributions are Neuts [659] and Kao [521]. The present treatment, similar to that in 
[APQ], is somewhat more probabilistic. 


3 The compound Poisson model 


3a Phase-type claims


Consider the compound Poisson (Cramér-Lundberg) model in the notation of
Chapter I, with $\beta$ denoting the Poisson intensity, $B$ the claim size distribution,
$\tau(u)$ the time of ruin with initial reserve $u$, $\{S_t\}$ the claim surplus process,
$G_+(\cdot) = P(S_{\tau(0)} \in \cdot,\ \tau(0) < \infty)$ the ladder height distribution and
$M = \sup_{t\ge0} S_t$. We assume that $B$ is phase-type with representation $(\alpha, T)$.


Corollary 3.1 Assume that the claim size distribution $B$ is phase-type with
representation $(\alpha, T)$. Then:

(a) $G_+$ is defective phase-type with representation $(\alpha_+, T)$ where $\alpha_+$ is given by
$\alpha_+ = -\beta\alpha T^{-1}$, and $M$ is zero-modified phase-type with representation
$(\alpha_+, T + \mathbf{t}\alpha_+)$.

(b) $\psi(u) = \alpha_+e^{(T+\mathbf{t}\alpha_+)u}e$.


Note in particular that $\rho = \|G_+\| = \alpha_+e$.


Proof. The result follows immediately by combining the Pollaczek-Khinchine
formula with general results on phase-type distributions: for (a), use the phase-type
representation of $B_0$, cf. Corollary 2.3. For (b), represent the maximum $M$
as the lifetime of a terminating renewal process and use Corollary 2.2.

Since the result is quite fundamental, we shall, however, add a more self-contained
explanation of why the phase-type structure is preserved. The essence
is contained in Fig. IX.6. Here we have taken the terminating Markov process
underlying $B$ with two states, marked by thin and thick lines on the figure.
Then each claim (jump) corresponds to one (finite) sample path of the Markov
process. The stars represent the ladder points $S_{\tau_+(k)}$. Considering the first, we
see that the ladder height $S_{\tau_+}$ is just the residual lifetime of the Markov process
corresponding to the claim causing upcrossing of level 0, i.e. itself phase-type
with the same phase generator $T$ and the initial vector $\alpha_+$ being the distribution
of the upcrossing Markov process at time $-S_{\tau_+-}$. Next, the Markov processes
representing ladder steps can be pieced together to one $\{m_x\}$. Within ladder
steps, the transitions are governed by $T$ whereas termination of ladder steps
may lead to some additional ones: a transition from $i$ to $j$ occurs if the ladder
step terminates in state $i$, which occurs at rate $t_i$, and if there is a subsequent
ladder step starting in $j$ which occurs w.p. $\alpha_{+j}$. Thus the total rate is $t_{ij} + t_i\alpha_{+j}$,
and rewriting in matrix form yields the phase generator of $\{m_x\}$ as $T + \mathbf{t}\alpha_+$.
Now just observe that the initial vector of $\{m_x\}$ is $\alpha_+$ and that the lifetime
is $M$.


FIGURE IX.6 


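This derivation is a complete proof except for the identification of $\alpha_+$ with
$-\beta\alpha T^{-1}$. This is in fact a simple consequence of the form of the excess
distribution $B_0$, see Corollary 2.3.

Example 3.2 Assume that $\beta = 3$ and

$$b(x) = \frac12\cdot3e^{-3x} + \frac12\cdot7e^{-7x}.$$

Thus $b$ is hyperexponential (a mixture of exponential distributions) with
$\alpha = (\frac12\ \ \frac12)$, $T = \mathrm{diag}(-3, -7)$ so that

$$\alpha_+ = -\beta\alpha T^{-1} = -3\,(1/2\ \ 1/2)\begin{pmatrix} -1/3 & 0 \\ 0 & -1/7 \end{pmatrix} = (1/2\ \ 3/14),$$

$$T + \mathbf{t}\alpha_+ = \begin{pmatrix} -3 & 0 \\ 0 & -7 \end{pmatrix} + \begin{pmatrix} 3 \\ 7 \end{pmatrix}(1/2\ \ 3/14) = \begin{pmatrix} -3/2 & 9/14 \\ 7/2 & -11/2 \end{pmatrix}.$$

This is the same matrix as in Example 1.6, so that as there

$$e^{(T+\mathbf{t}\alpha_+)u} = e^{-u}\begin{pmatrix} 9/10 & 9/70 \\ 7/10 & 1/10 \end{pmatrix} + e^{-6u}\begin{pmatrix} 1/10 & -9/70 \\ -7/10 & 9/10 \end{pmatrix}.$$

Thus

$$\psi(u) = \alpha_+e^{(T+\mathbf{t}\alpha_+)u}e = \frac{24}{35}e^{-u} + \frac{1}{35}e^{-6u}.$$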
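The computation in Example 3.2 can be replayed in exact arithmetic. The sketch below forms $\alpha_+ = -\beta\alpha T^{-1}$ and $T + \mathbf{t}\alpha_+$ for the diagonal $T$ of the example, takes the eigenvalues $-1$ and $-6$ from Example 1.6 as given (an assumption that the code verifies against the characteristic polynomial), and returns the two coefficients of $\psi(u)$. All helper names are hypothetical.

```python
from fractions import Fraction as F
from math import exp

beta = F(3)
alpha = [F(1, 2), F(1, 2)]
rates = [F(3), F(7)]                        # T = diag(-3, -7), exit rates t = (3, 7)

# alpha_+ = -beta * alpha * T^{-1}; for diagonal T this is beta * alpha_i / rate_i.
alpha_plus = [beta * a / d for a, d in zip(alpha, rates)]

# Q = T + t alpha_+
Q = [[(-rates[i] if i == j else F(0)) + rates[i] * alpha_plus[j] for j in range(2)]
     for i in range(2)]

def char_poly(lam):
    return (Q[0][0] - lam) * (Q[1][1] - lam) - Q[0][1] * Q[1][0]

def projector(lam_self, lam_other):
    """Spectral projector (Q - lam_other I) / (lam_self - lam_other)."""
    return [[(Q[i][j] - (lam_other if i == j else 0)) / (lam_self - lam_other)
             for j in range(2)] for i in range(2)]

def coefficient(E):
    """alpha_+ E e, the weight of the corresponding exponential term in psi(u)."""
    return sum(alpha_plus[i] * E[i][j] for i in range(2) for j in range(2))

lam1, lam2 = F(-1), F(-6)                   # eigenvalues quoted from Example 1.6
c1 = coefficient(projector(lam1, lam2))
c2 = coefficient(projector(lam2, lam1))

def psi(u):
    return float(c1) * exp(-u) + float(c2) * exp(-6.0 * u)

print(alpha_plus, c1, c2, psi(0.0))
```

As a consistency check, $\psi(0) = c_1 + c_2$ should equal $\rho = \alpha_+e = \beta\mu_B$.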


Notes and references Corollary 3.1 can be found in Neuts [660] (in the setting
of M/G/1 queues, cf. the duality result given in Corollary III.3.6), but that such
a simple and general solution existed does not appear to have been well known to
the risk theory community before a rather late stage. The result carries over to $B$
being matrix-exponential, see Section 6. In the next sections, we encounter similar
expressions for the ruin probabilities in the renewal- and Markov-modulated models,
but there the vector $\alpha_+$ is not explicit but needs to be calculated (typically by an
iteration or a rootfinding).

The parameters of Example 3.2 are taken from Gerber [398]; his derivation of $\psi(u)$
is different.

For further more or less explicit computations of ruin probabilities, see Shiu [800].

It is notable that the phase-type assumption does not seem to simplify the
computation of finite horizon ruin probabilities substantially (but see Section 8). For an
attempt, see Stanford & Stroiński [817].


4 The renewal model 


We consider the renewal model in the notation of Chapter VI, with $A$ denoting
the interarrival distribution and $B$ the claim size distribution. We assume
$\rho = \mu_B/\mu_A < 1$ and that $B$ is phase-type with representation $(\alpha, T)$. We shall
derive phase-type representations of the ruin probabilities $\psi(u)$, $\psi^{(s)}(u)$ (recall
that $\psi(u)$ refers to the zero-delayed case and $\psi^{(s)}(u)$ to the stationary case). For
the compound Poisson model, this was obtained in Section 3, and the argument
for the renewal case starts in just the same way (cf. the discussion around Fig.
IX.6 which does not use that $A$ is exponential) by noting that the distribution
$G_+$ of the ascending ladder height $S_{\tau_+}$ is necessarily (defective) phase-type with
representation $(\alpha_+, T)$ for some vector $\alpha_+ = (\alpha_{+j})$. That is, if we define $\{m_x\}$
just as for the Poisson case (cf. Fig. IX.6):


Proposition 4.1 In the zero-delayed case,

(a) $G_+$ is of phase-type with representation $(\alpha_+, T)$, where $\alpha_+$ is the (defective)
distribution of $m_0$;

(b) The maximum claim surplus $M$ is the lifetime of $\{m_x\}$;

(c) $\{m_x\}$ is a (terminating) Markov process on $E$, with intensity matrix $Q$ given
by $Q = T + \mathbf{t}\alpha_+$.


The key difference from the Poisson case is that it is more difficult to evaluate
$\alpha_+$. In fact, the form in which we derive $\alpha_+$ for the renewal model is as the
unique solution of a fixed point problem $\alpha_+ = \varphi(\alpha_+)$, which for numerical
purposes can be solved by iteration. Nevertheless, the calculation of the first
ladder height is simple in the stationary case:


Proposition 4.2 The distribution $G_+^{(s)}$ of the first ladder height of the claim
surplus process $\{S_t^{(s)}\}$ for the stationary case is phase-type with representation
$(\alpha^{(s)}, T)$, where $\alpha^{(s)} = -\alpha T^{-1}/\mu_A$.


Proof. Obviously, the Palm distribution of the claim size is just $B$. Hence by
Theorem III.5.5, $G_+^{(s)} = \rho B_0$, where $B_0$ is the stationary excess life distribution
corresponding to $B$. But by Corollary 2.3, $B_0$ is phase-type with representation
$(-\alpha T^{-1}/\mu_B, T)$.


Proposition 4.3 $\alpha_+$ satisfies $\alpha_+ = \varphi(\alpha_+)$, where

$$\varphi(\alpha_+) = \alpha\hat A[T + t\alpha_+] = \alpha\int_0^\infty e^{(T + t\alpha_+)y}\,A(dy). \tag{4.1}$$


Proof. We condition upon $T_1 = y$ and define $\{m_x^*\}$ from $\{S_{t+y} - S_{y-}\}$ in the
same way as $\{m_x\}$ is defined from $\{S_t\}$, cf. Fig. IX.7. Then $\{m_x^*\}$ is Markov
with the same transition intensities as $\{m_x\}$, but with initial distribution $\alpha$
rather than $\alpha_+$. Also, obviously $m_0 = m_y^*$. Since the conditional distribution
of $m_y^*$ given $T_1 = y$ is $\alpha e^{Qy}$, it follows by integrating $y$ out that the distribution
$\alpha_+$ of $m_0$ is given by the final expression in (4.1).




FIGURE IX.7

We have now almost collected all the pieces of the main result of this section:


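Theorem 4.4 Consider the renewal model with interarrival distribution $A$ and
the claim size distribution $B$ being of phase-type with representation $(\alpha, T)$.
Then

$$\psi(u) = \alpha_+ e^{(T + t\alpha_+)u}e, \qquad \psi^{(s)}(u) = \alpha^{(s)} e^{(T + t\alpha_+)u}e, \tag{4.2}$$

where $\alpha_+$ satisfies (4.1) and $\alpha^{(s)} = -\alpha T^{-1}/\mu_A$. Furthermore, $\alpha_+$ can be
computed by iteration of (4.1), i.e. by

$$\alpha_+ = \lim_{n\to\infty}\alpha_+^{(n)} \quad\text{where}\quad \alpha_+^{(0)} = 0,\ \ \alpha_+^{(1)} = \varphi(\alpha_+^{(0)}),\ \ \alpha_+^{(2)} = \varphi(\alpha_+^{(1)}),\ \ldots \tag{4.3}$$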


Proof. The first expression in (4.2) follows from Proposition 4.1 by noting that
the distribution of $m_0$ is $\alpha_+$. The second follows in a similar way by noting that
only the first ladder step has a different distribution in the stationary case, and
that this is given by Proposition 4.2; thus, the maximum claim surplus for the
stationary case has a similar representation as in Proposition 4.1(b), only with
initial distribution $\alpha^{(s)}$ for $m_0$.

It remains to prove convergence of the iteration scheme (4.3). The term $t\beta$ in
$\varphi(\beta)$ represents feedback with rate vector $t$ and feedback probability vector $\beta$.
Hence $\varphi(\beta)$ (defined on the domain of subprobability vectors $\beta$) is an increasing
function of $\beta$. In particular, $\alpha_+^{(1)} \ge 0 = \alpha_+^{(0)}$ implies

$$\alpha_+^{(2)} = \varphi(\alpha_+^{(1)}) \ge \varphi(\alpha_+^{(0)}) = \alpha_+^{(1)}$$

and (by induction) that $\{\alpha_+^{(n)}\}$ is an increasing sequence such that $\lim_{n\to\infty}\alpha_+^{(n)}$
exists. Similarly, $0 = \alpha_+^{(0)} \le \alpha_+$ yields

$$\alpha_+^{(1)} = \varphi(\alpha_+^{(0)}) \le \varphi(\alpha_+) = \alpha_+$$

and by induction that $\alpha_+^{(n)} \le \alpha_+$ for all $n$. Thus, $\lim_{n\to\infty}\alpha_+^{(n)} \le \alpha_+$.

To prove the converse inequality, we use an argument similar to the proof of
Proposition VII.2.4. Let $F_n = \{T_1 + \cdots + T_{n+1} > \tau_+\}$ be the event that $\{m_x^*\}$
has at most $n$ arrivals in $[T_1, \tau_+]$, and let $\tilde\alpha_{+;i}^{(n)} = P(m_{T_1}^* = i;\ F_n)$. Obviously,
$\tilde\alpha_+^{(n)} \uparrow \alpha_+$, so to complete the proof it suffices to show that $\tilde\alpha_+^{(n)} \le \alpha_+^{(n)}$ for all
$n$. For $n = 0$, both quantities are just 0. Assume the assertion shown for $n - 1$.
Then each subexcursion of $\{S_{t+T_1} - S_{T_1-}\}$ can contain at most $n - 1$ arrivals
($n$ arrivals are excluded because of the initial arrival at time $T_1$). It follows that
on $F_n$ the feedback to $\{m_x^*\}$ after each ladder step cannot exceed $\tilde\alpha_+^{(n-1)}$ so that

$$\tilde\alpha_+^{(n)} \le \alpha\int_0^\infty e^{(T + t\tilde\alpha_+^{(n-1)})y}\,A(dy)
 \le \alpha\int_0^\infty e^{(T + t\alpha_+^{(n-1)})y}\,A(dy) = \varphi(\alpha_+^{(n-1)}) = \alpha_+^{(n)}.$$
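The iteration (4.3) can be sketched numerically as follows. This is a minimal illustration, not the implementation used in the cited references: it assumes $d = 2$ phases and exponential interarrival times with rate `beta`, so that $\hat A[Q] = \beta(\beta I - Q)^{-1}$ and the limit can be compared with the compound Poisson value $\alpha_+ = -\beta\alpha T^{-1}$. The function names and the hyperexponential parameters are hypothetical choices made for the example.

```python
def phi(a, alpha, T, t, beta):
    """One application of the map (4.1) for d = 2 phases, assuming A is
    exponential with rate beta, so that A-hat[Q] = beta * (beta*I - Q)^(-1)."""
    # M = beta*I - (T + t a), a 2x2 matrix
    m00 = beta - (T[0][0] + t[0] * a[0])
    m01 = -(T[0][1] + t[0] * a[1])
    m10 = -(T[1][0] + t[1] * a[0])
    m11 = beta - (T[1][1] + t[1] * a[1])
    det = m00 * m11 - m01 * m10
    inv = [[m11 / det, -m01 / det], [-m10 / det, m00 / det]]
    # row vector alpha * beta * M^(-1)
    return [beta * (alpha[0] * inv[0][j] + alpha[1] * inv[1][j]) for j in range(2)]


def iterate_alpha_plus(alpha, T, t, beta, tol=1e-14, max_iter=10_000):
    """Iteration (4.3): start from 0 and apply phi until the iterates settle."""
    a = [0.0, 0.0]
    for _ in range(max_iter):
        nxt = phi(a, alpha, T, t, beta)
        if max(abs(nxt[j] - a[j]) for j in range(2)) < tol:
            return nxt
        a = nxt
    return a


# Illustrative (assumed) parameters: hyperexponential claims, arrival rate 3.
alpha = [0.5, 0.5]
T = [[-3.0, 0.0], [0.0, -7.0]]
t = [3.0, 7.0]            # exit rates, t = -T e
beta = 3.0
alpha_plus = iterate_alpha_plus(alpha, T, t, beta)
```

With these parameters the iterates increase from 0 towards $-\beta\alpha T^{-1} = (1/2,\ 3/14)$, in line with the monotonicity argument in the proof.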


We next give an alternative algorithm, which links together the phase-type
setting and the classical complex plane approach of the renewal model (see
further the Notes). To this end, let $F$ be the distribution of $U_1 - T_1$. Then

$$\hat F[r] = \alpha(-rI - T)^{-1}t\cdot\hat A[-r] \tag{4.4}$$

whenever $E e^{\Re(r)U} < \infty$. However, (4.4) makes sense and provides an analytic
continuation of $\hat F[\cdot]$ as long as $-r \notin \mathrm{sp}(T)$.


Theorem 4.5 Let $r$ be some complex number with $\Re(r) > 0$, $-r \notin \mathrm{sp}(T)$.
Then $-r$ is an eigenvalue of $Q = T + t\alpha_+$ if and only if $1 = \hat F[r] = \hat A[-r]\hat B[r]$,
with $\hat B[r]$, $\hat F[r]$ being interpreted in the sense of the analytical continuation of
the m.g.f. In that case, the corresponding right eigenvector may be taken as
$(-rI - T)^{-1}t$.


Proof. Suppose first $Qh = -rh$. Then $e^{Qx}h = e^{-rx}h$ and hence

$$-rh = Qh = (T + t\alpha\hat A[Q])h = Th + \hat A[-r]\,t\alpha h. \tag{4.5}$$

Since $-r \notin \mathrm{sp}(T)$, this implies that $\alpha h\hat A[-r] \ne 0$, and hence we may assume
that $h$ has been normalized such that $\alpha h\hat A[-r] = 1$. Then (4.5) yields $h =
(-rI - T)^{-1}t$. Thus by (4.4), the normalization is equivalent to $\hat F[r] = 1$.

Suppose next $\hat F[r] = 1$. Since $\Re(r) > 0$ and $G_-$ is concentrated on $(-\infty, 0)$,
we have $|\hat G_-[r]| < 1$, and hence by the Wiener-Hopf factorization identity (A.9)
we have $\hat G_+[r] = 1$ which according to Theorem 1.5(c) means that $\alpha_+(-rI -
T)^{-1}t = 1$. Hence with $h = (-rI - T)^{-1}t$ we get

$$Qh = (T + t\alpha_+)h = T(-rI - T)^{-1}t + t = -r(-rI - T)^{-1}t = -rh.$$


Let d denote the number of phases. 


Corollary 4.6 Suppose $\mu < 0$, that the equation $\hat F[r] = 1$ has $d$ distinct
roots $\rho_1,\ldots,\rho_d$ in the domain $\Re(r) > 0$, and define $h_i = (-\rho_iI - T)^{-1}t$,
$Q = DC^{-1}$ where $C$ is the matrix with columns $h_1,\ldots,h_d$, $D$ that with columns
$-\rho_1h_1,\ldots,-\rho_dh_d$. Then $G_+$ is phase-type with representation $(\alpha_+, T)$ with
$\alpha_+ = \alpha(Q - T)/\alpha t$. Further, letting $\nu_i$ be the left eigenvector of $Q$ corresponding
to $-\rho_i$ and normalized by $\nu_ih_i = 1$, $Q$ has diagonal form

$$Q = -\sum_{i=1}^d \rho_i\,\nu_i\otimes h_i = -\sum_{i=1}^d \rho_i\,h_i\nu_i. \tag{4.6}$$


Proof. Appealing to Theorem 4.5, the matrix $Q$ in Theorem 2.1 has the $d$
distinct eigenvalues $-\rho_1,\ldots,-\rho_d$ with corresponding eigenvectors $h_1,\ldots,h_d$,
i.e. $QC = D$. This immediately implies that $Q$ has the form $DC^{-1}$ and the last
assertion on the diagonal form. Given $Q$ has been computed, we get

$$\frac{1}{\alpha t}\,\alpha(Q - T) = \frac{1}{\alpha t}\,\alpha t\,\alpha_+ = \alpha_+.$$
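A small numerical check of the root-based construction may help. The sketch below assumes exponential interarrival times with rate `beta` and a two-phase hyperexponential $B$ with diagonal $T$ (illustrative parameters, not taken from the text), for which $\hat F[r] = 1$ has the roots $r = 1$ and $r = 6$. It builds $C$ and $D$ column by column and uses $QC = D$, i.e. $Q = DC^{-1}$; the helper names are hypothetical.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]


def matmul2(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


alpha = [0.5, 0.5]
T = [[-3.0, 0.0], [0.0, -7.0]]   # diagonal, so (-rI - T)^(-1) is elementwise
t = [3.0, 7.0]
beta = 3.0


def F_hat(r):
    """A-hat[-r] * B-hat[r], with B-hat[r] = alpha (-rI - T)^(-1) t continued analytically."""
    a_hat = beta / (beta + r)
    b_hat = sum(alpha[i] * t[i] / (-r - T[i][i]) for i in range(2))
    return a_hat * b_hat


roots = [1.0, 6.0]               # solutions of F_hat(r) = 1 with Re(r) > 0
# h_i = (-rho_i I - T)^(-1) t
h = [[t[k] / (-rho - T[k][k]) for k in range(2)] for rho in roots]
C = [[h[0][0], h[1][0]], [h[0][1], h[1][1]]]                      # columns h_1, h_2
D = [[-roots[0] * h[0][0], -roots[1] * h[1][0]],
     [-roots[0] * h[0][1], -roots[1] * h[1][1]]]                  # columns -rho_i h_i
Q = matmul2(D, inv2(C))
alpha_t = sum(alpha[i] * t[i] for i in range(2))
alpha_plus = [sum(alpha[i] * (Q[i][j] - T[i][j]) for i in range(2)) / alpha_t
              for j in range(2)]
```

For these parameters the recovered vector agrees with $-\beta\alpha T^{-1} = (1/2,\ 3/14)$, which is what the compound Poisson theory of Section 3 predicts when $A$ is exponential.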


Notes and references Results like those of the present section have a long history,
and the topic is classic both in risk theory and queueing theory (recall that we can
identify $\psi(u)$ with the tail $P(W > u)$ of the GI/PH/1 waiting time $W$; in turn,
$W \stackrel{\mathcal D}{=} M^{(d)}$ in the notation of Chapter VI). In older literature, explicit expressions for
the ruin/queueing probabilities are most often derived under the slightly more general
assumption that $B$ is rational (say with degree $d$ of the polynomial in the denominator)
as discussed in Section 6. As in Corollary 4.6, the classical algorithm starts by looking
for roots in the complex plane of the equation $\hat B[\gamma]\hat A[-\gamma] = 1$, $\Re(\gamma) > 0$. The roots
are counted and located by Rouché's theorem (a classical result from complex analysis
giving a criterion for two complex functions to have the same number of zeros within
a defined region). This gives $d$ roots $\gamma_1,\ldots,\gamma_d$ satisfying $\Re(\gamma_i) > 0$, and the solution
is then in transform terms


1ta f e“y(u)du = Ee” = Iw /He-w (4.7) 


i=l i=l 


(see, e.g., Asmussen & O’Cinneide [94] for a short self-contained derivation). In risk 
theory, a pioneering paper in this direction is Tacklind [826], whereas the approach 
was introduced in queueing theory by Smith [814]; similar discussion appears in Kem- 
perman [528] and much of the queueing literature like Cohen [249], see also Chapters 
XII and XIII. 

This complex plane approach has been met with substantial criticism for a number
of reasons, such as lacking a probabilistic interpretation and not giving the waiting time
distribution / ruin probability itself but only the transform. In queueing theory, an
alternative approach (the matrix-geometric method) has been developed largely by
M.F. Neuts and his students, starting around 1975. For surveys, see Neuts [660],
[661] and Latouche & Ramaswami [574]. Here phase-type assumptions are basic, but
the models solved are basically Markov chains and Markov processes with countably
many states (for example queue length processes). The solutions are based upon
iteration schemes like the one in Theorem 4.4; the fixed point problems look like

$$R = A_0 + RA_1 + R^2A_2 + \cdots,$$

where $R$ is an unknown matrix, and appear already in some early work by Wallace
[870]. The distribution of $W$ comes out from the approach but in a rather complicated
form. The matrix-exponential form of the distribution was found by Sengupta [793]
and the phase-type form by the first author [60].

The exposition here is based upon [60], which contains somewhat stronger results 
concerning the fixed point problem and the iteration scheme. Numerical examples 
appear in Asmussen & Rolski [97]. 

For further early explicit computations of ruin probabilities in the phase-type re- 
newal case, see Dickson & Hipp [314, 315]; some recent extensions are discussed in 
Section XII.3. There is also much literature on the case where A is phase-type with a 
few phases. 


5 Markov-modulated input 


We consider a claim surplus process $\{S_t\}$ in a Markovian environment in the
notation of Chapter VII. That is, the background Markov process with $p$ states
is $\{J_t\}$, the intensity matrix is $\Lambda$ and the stationary row vector is $\pi$. The arrival
rate in background state $i$ is $\beta_i$ and the distribution of an arrival claim is $B_i$.
We assume that each $B_i$ is phase-type, with representation say $(\alpha^{(i)}, T^{(i)}, E^{(i)})$.
The number of elements of $E^{(i)}$ is denoted by $q_i$.

It turns out that subject to the phase-type assumption, the ruin probability
can be found in matrix-exponential form just as for the renewal model, involving
some parameters like the ones $Q$ or $\alpha_+$ for the renewal model which need to be
determined by similar algorithms.

We start in Section 5a with an algorithm involving roots in a similar manner 
as Corollary 4.6. However, the analysis involves new features like an equivalence 
with first passage problems for Markovian fluids and the use of martingales 
(these ideas also apply to phase-type renewal models though we have not given 
the details). Section 5b then gives a representation along the lines of Theorem 
4.4. The key unknown is the matrix K, for which the relevant fixed point 
problem and iteration scheme has already been studied in VII.2. 


5a Calculations via fluid models. Diagonalization 


Consider a process $\{(I_t, V_t)\}_{t\ge 0}$ such that $\{I_t\}$ is a Markov process with a finite
state space $F$ and $\{V_t\}$ has piecewise linear paths, say with slope $r(i)$ on intervals
where $I_t = i$. The version of the process obtained by imposing reflection on
the $V$ component is denoted a Markovian fluid and is of considerable interest in
telecommunications engineering as a model for an ATM (Asynchronous Transfer
Mode) switch. The stationary distribution is obtained by finding the maximum
of the $V$-component of the version of $\{(I_t, V_t)\}$ obtained by time reversing the
$I$ component. This calculation in a special case gives also the ruin probabilities
for the Markov-modulated risk process with phase-type claims. The connection
between the two models is a fluid representation of the Markov-modulated risk
process given in Fig. IX.8.


FIGURE IX.8

On Fig. IX.8, $p = q_1 = q_2 = 2$. The two environmental states are denoted
o,e, the phase space E°) for B, has states o, Q, and the one EB) for Be states 
%, A. A claim in state i can then be represented by an E-valued Markov 
process as on Fig. IX.8(a). The fluid model $\{(I_t, V_t)\}$ on Fig. IX.8(b) is then
obtained by changing the vertical jumps to segments with slope 1. Thus F =
{o,9, ae e,é&,@}. In the general formulation, $F$ is the disjoint union of $E$ and
the $E^{(i)}$,

$$F = E\cup\{(i,\alpha):\ i\in E,\ \alpha\in E^{(i)}\},\qquad r(i) = -1,\ i\in E,\qquad r(i,\alpha) = 1.$$


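The intensity matrix for $\{I_t\}$ is (taking $p = 3$ for simplicity)

$$\left(\begin{array}{c|c}
\Lambda - (\beta_i)_{\rm diag} &
\begin{matrix}\beta_1\alpha^{(1)} & 0 & 0\\ 0 & \beta_2\alpha^{(2)} & 0\\ 0 & 0 & \beta_3\alpha^{(3)}\end{matrix}\\ \hline
\begin{matrix}t^{(1)} & 0 & 0\\ 0 & t^{(2)} & 0\\ 0 & 0 & t^{(3)}\end{matrix} &
\begin{matrix}T^{(1)} & 0 & 0\\ 0 & T^{(2)} & 0\\ 0 & 0 & T^{(3)}\end{matrix}
\end{array}\right).$$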


The reasons for using the fluid representation are twofold. First, the probability
in the Markov-modulated model of upcrossing level $u$ in state $i$ of $\{J_t\}$ and
phase $\alpha\in E^{(i)}$ is the same as the probability that the fluid model upcrosses
level $u$ in state $(i,\alpha)$ of $\{I_t\}$. Second, in the fluid model $Ee^{sV_t} < \infty$ for all $s,t$,
whereas $Ee^{sS_t} = \infty$ for all $t$ and all $s > s_0$ where $s_0 < \infty$. This implies that in
the fluid context, we have more martingales at our disposal.

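Recall that in the phase-type case, $\hat B_i[s] = -\alpha^{(i)}(T^{(i)} + sI)^{-1}t^{(i)}$. Let $\Sigma$
denote the matrix $\Delta_r^{-1}\Lambda_I$, where $\Lambda_I$ is the intensity matrix of $\{I_t\}$ and $\Delta_r$ the
diagonal matrix with the slopes $r(\cdot)$ on the diagonal, i.e. (again for $p = 3$)

$$\Sigma = \Delta_r^{-1}\Lambda_I = \left(\begin{array}{c|c}
(\beta_i)_{\rm diag} - \Lambda &
\begin{matrix}-\beta_1\alpha^{(1)} & 0 & 0\\ 0 & -\beta_2\alpha^{(2)} & 0\\ 0 & 0 & -\beta_3\alpha^{(3)}\end{matrix}\\ \hline
\begin{matrix}t^{(1)} & 0 & 0\\ 0 & t^{(2)} & 0\\ 0 & 0 & t^{(3)}\end{matrix} &
\begin{matrix}T^{(1)} & 0 & 0\\ 0 & T^{(2)} & 0\\ 0 & 0 & T^{(3)}\end{matrix}
\end{array}\right),$$

with the four blocks denoted by $\Sigma_{ij}$, $i,j = 1,2$, corresponding to the partitioning
of $\Sigma$ into components indexed by $E$, resp. $E^{(1)} + \cdots + E^{(p)}$.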


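Proposition 5.1 A complex number $s$ satisfies

$$\bigl|\Lambda + (\beta_i(\hat B_i[-s] - 1))_{\rm diag} + sI\bigr| = 0 \tag{5.1}$$

if and only if $s$ is an eigenvalue of $\Sigma$. If $s$ is such a number, consider the vector $a$
satisfying $\bigl(\Lambda + (\beta_i(\hat B_i[-s] - 1))_{\rm diag}\bigr)a = -sa$ and the eigenvector $b = \binom{c}{d}$ of
$\Delta_r^{-1}\Lambda_I$, where $c, d$ correspond to the partitioning of $b$ into components indexed
by $E$, resp. $E^{(1)} + \cdots + E^{(p)}$. Then (up to a constant)

$$c = a, \qquad d = (sI - \Sigma_{22})^{-1}\Sigma_{21}a = \sum_{i\in E} a_i(sI - T^{(i)})^{-1}t^{(i)},$$

where the $i$th term of the sum is understood as the block of $d$ indexed by $E^{(i)}$.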
icE 


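Proof. Using the well-known determinant identity

$$\begin{vmatrix}\Sigma_{11} & \Sigma_{12}\\ \Sigma_{21} & \Sigma_{22}\end{vmatrix}
 = |\Sigma_{22}|\cdot|\Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}|,$$

with $\Sigma_{ii}$ replaced by $\Sigma_{ii} - sI$, it follows that if

$$\left|\begin{array}{c|c}
(\beta_i)_{\rm diag} - \Lambda - sI &
\begin{matrix}-\beta_1\alpha^{(1)} & 0 & 0\\ 0 & -\beta_2\alpha^{(2)} & 0\\ 0 & 0 & -\beta_3\alpha^{(3)}\end{matrix}\\ \hline
\begin{matrix}t^{(1)} & 0 & 0\\ 0 & t^{(2)} & 0\\ 0 & 0 & t^{(3)}\end{matrix} &
\begin{matrix}T^{(1)} - sI & 0 & 0\\ 0 & T^{(2)} - sI & 0\\ 0 & 0 & T^{(3)} - sI\end{matrix}
\end{array}\right| = 0,$$

then also

$$\bigl|(\beta_i)_{\rm diag} - \Lambda - sI + \bigl(\beta_i\alpha^{(i)}(T^{(i)} - sI)^{-1}t^{(i)}\bigr)_{\rm diag}\bigr| = 0,$$

which is the same as (5.1).
For the assertions on the eigenvectors, assume that $a$ is chosen as asserted
which means

$$\bigl(\Sigma_{11} - sI + \Sigma_{12}(sI - \Sigma_{22})^{-1}\Sigma_{21}\bigr)a = 0,$$

and let $d = (sI - \Sigma_{22})^{-1}\Sigma_{21}a$, $c = a$. Then

$$\Sigma_{21}c + \Sigma_{22}d = \Sigma_{21}a - (sI - \Sigma_{22} - sI)(sI - \Sigma_{22})^{-1}\Sigma_{21}a
 = \Sigma_{21}a - \Sigma_{21}a + sd = sd.$$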


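Noting that $\Sigma_{11}c + \Sigma_{12}d = sc$ by definition, it follows that

$$\begin{pmatrix}\Sigma_{11} & \Sigma_{12}\\ \Sigma_{21} & \Sigma_{22}\end{pmatrix}\begin{pmatrix}c\\ d\end{pmatrix} = s\begin{pmatrix}c\\ d\end{pmatrix}.$$


Theorem 5.2 Assume that $\Sigma = \Delta_r^{-1}\Lambda_I$ has $q = q_1 + \cdots + q_p$ distinct eigenvalues
$s_1,\ldots,s_q$ with $\Re(s_\nu) < 0$ and let $b^{(\nu)} = \binom{c^{(\nu)}}{d^{(\nu)}}$ be the right eigenvector
corresponding to $s_\nu$, $\nu = 1,\ldots,q$. Then

$$\psi_i(u) = e_i'\bigl(e^{s_1u}c^{(1)}\ \ldots\ e^{s_qu}c^{(q)}\bigr)\bigl(d^{(1)}\ \ldots\ d^{(q)}\bigr)^{-1}e.$$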


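Proof. Writing $\Delta_r^{-1}\Lambda_Ib^{(\nu)} = s_\nu b^{(\nu)}$ as $(\Lambda_I - \Delta_rs_\nu)b^{(\nu)} = 0$, it follows by
Proposition III.4.4 that $\{e^{-s_\nu V_t}b^{(\nu)}_{I_t}\}$ is a martingale. For $u, v > 0$, define

$$\omega(u,v) = \inf\{t > 0:\ V_t = u \text{ or } V_t = -v\},\qquad \omega(u) = \inf\{t > 0:\ V_t = u\},$$

$$p_i(u,v;j,\alpha) = P_i\bigl(V_{\omega(u,v)} = u,\ I_{\omega(u,v)} = (j,\alpha)\bigr),$$
$$p_i(u,v;j) = P_i\bigl(V_{\omega(u,v)} = -v,\ I_{\omega(u,v)} = j\bigr),$$
$$p_i(u;j,\alpha) = P_i\bigl(\omega(u) < \infty,\ I_{\omega(u)} = (j,\alpha)\bigr).$$

Optional stopping at time $\omega(u,v)$ yields

$$c_i^{(\nu)} = e^{-s_\nu u}\sum_{j,\alpha}p_i(u,v;j,\alpha)\,d_{j,\alpha}^{(\nu)} + e^{s_\nu v}\sum_jp_i(u,v;j)\,c_j^{(\nu)}.$$

Letting $v\to\infty$ and using $\Re(s_\nu) < 0$ yields

$$e^{s_\nu u}c_i^{(\nu)} = \sum_{j,\alpha}p_i(u;j,\alpha)\,d_{j,\alpha}^{(\nu)}.$$

Solving for the $p_i(u;j,\alpha)$ and noting that $\psi_i(u) = \sum_{j,\alpha}p_i(u;j,\alpha)$, the result
follows.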


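Example 5.3 Consider the Poisson model with exponential claims with rate
$\delta$. Here $E$ has one state only. To determine $\psi(u)$, we first look for the negative
eigenvalue $s$ of $\Sigma = \begin{pmatrix}\beta & -\beta\\ \delta & -\delta\end{pmatrix}$ which is $s = -\gamma$ with $\gamma = \delta - \beta$. We can take
$a = c = 1$ and get $d = (s + \delta)^{-1}\delta = \delta/\beta = 1/\rho$. Thus $\psi(u) = e^{su}/d = \rho e^{-\gamma u}$ as
should be.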
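As a quick check of the eigenvalue quoted in Example 5.3, the characteristic polynomial can be expanded directly. The computation below assumes that the matrix of the example is $\Sigma = \begin{pmatrix}\beta & -\beta\\ \delta & -\delta\end{pmatrix}$, which is what the general block form gives for a single environmental state ($\Lambda = 0$) and a single phase ($\alpha = 1$, $T = -\delta$, $t = \delta$):

```latex
\det(\Sigma - sI)
  = \det\begin{pmatrix} \beta - s & -\beta \\ \delta & -\delta - s \end{pmatrix}
  = (\beta - s)(-\delta - s) + \beta\delta
  = s^2 + (\delta - \beta)\,s
  = s\,(s + \gamma), \qquad \gamma = \delta - \beta .
```

The eigenvalues are therefore $0$ and $-\gamma$, and for $s = -\gamma$ one has $s + \delta = \beta$, which gives $d = \delta/\beta = 1/\rho$ as stated.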


Example 5.4 Assume that $E$ has two states and that $B_1, B_2$ are both exponential
with rates $\delta_1, \delta_2$. Then we get $\psi_i(u)$ as the sum of two exponential terms
where the rates $s_1, s_2$ are the negative eigenvalues of

$$\Sigma = \begin{pmatrix}
\lambda_1 + \beta_1 & -\lambda_1 & -\beta_1 & 0\\
-\lambda_2 & \lambda_2 + \beta_2 & 0 & -\beta_2\\
\delta_1 & 0 & -\delta_1 & 0\\
0 & \delta_2 & 0 & -\delta_2
\end{pmatrix}.$$


5b Computations via K


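Recall the definition of the matrix $K$ from VII.2. In terms of $K$, we get the
following phase-type representation for the ladder heights (see the Appendix for
the definition of the Kronecker product $\otimes$ and the Kronecker sum $\oplus$):


Proposition 5.5 $G_+(i,j;\cdot)$ is phase-type with representation $(E^{(j)}, \theta_j^{(i)}, T^{(j)})$
where

$$\theta_j^{(i)} = -\beta_j\,(e_i'\otimes\alpha^{(j)})\,(K\oplus T^{(j)})^{-1}\,(e_j\otimes I).$$

Proof. We must show that

$$G_+(i,j;(y,\infty)) = \theta_j^{(i)}e^{T^{(j)}y}e. \tag{5.2}$$

However, according to VII.(2.2) the l.h.s. is

$$\beta_j\int_{-\infty}^0R(i,j;dx)\,\bar B_j(y - x)
 = \beta_j\int_{-\infty}^0e_i'e^{-Kx}e_j\cdot\alpha^{(j)}e^{T^{(j)}(y - x)}e\,dx$$
$$ = \beta_j\,(e_i'\otimes\alpha^{(j)})\int_0^\infty e^{(K\oplus T^{(j)})x}\,dx\ (e_j\otimes I)\,e^{T^{(j)}y}e,$$

and the last integral equals $-(K\oplus T^{(j)})^{-1}$.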


Theorem 5.6 For $i\in E$, the $P_i$-distribution of $M$ is phase-type with representation
$(E^{(1)} + \cdots + E^{(p)}, \theta^{(i)}, U)$ where

$$u_{j\alpha,k\gamma} = \begin{cases}
t^{(j)}_{\alpha\gamma} + t^{(j)}_\alpha\theta^{(j)}_{k\gamma} & j = k,\\[2pt]
t^{(j)}_\alpha\theta^{(j)}_{k\gamma} & j\ne k.
\end{cases}$$

In particular,

$$\psi_i(u) = P_i(M > u) = \theta^{(i)}e^{Uu}e. \tag{5.3}$$

Proof. We decompose $M$ in the familiar way as a sum of ladder steps. Associated
with each ladder step is a phase process, with phase space $E^{(j)}$ whenever the
corresponding arrival occurs in environmental state $j$ (the ladder step is of type
$j$). Piecing together these phase processes yields a terminating Markov process
with state space $\bigcup_{i\in E}E^{(i)}$, intensity matrix $U$, say, and lifelength $M$, and it
just remains to check that $U$ has the asserted form. Starting from $J_0 = i$, the
initial value of $(j,\alpha)$ is obviously chosen according to $\theta^{(i)}$. For a transition from
$(j,\alpha)$ to $(k,\gamma)$ to occur when $j\ne k$, the current ladder step of type $j$ must
terminate, which occurs at rate $t^{(j)}_\alpha$, and a new ladder step of type $k$ must start
in phase $\gamma$, which occurs w.p. $\theta^{(j)}_{k\gamma}$. This yields the asserted form of $u_{j\alpha,k\gamma}$. For
$j = k$, we have the additional possibility of a phase change from $\alpha$ to $\gamma$ within
the ladder step, which occurs at rate $t^{(j)}_{\alpha\gamma}$.


Notes and references Section 5a is based upon Asmussen [63] and Section 5b 
upon Asmussen [59]. Numerical illustrations are given in Asmussen & Rolski [97]. 

The connection to fluid models is further exploited in a series of papers by Ahn 
& Ramaswami, e.g. [9, 10]. They also involve the connection to quasi birth-death 
processes, defined as birth-death processes in a Markovian environment and with some 
modification at the boundary 0. See also Badescu et al. [116]. First passage times for 
Markov additive processes with positive jumps of phase type are discussed in Breuer 
[201]. 


6 Matrix-exponential distributions 


When deriving explicit or algorithmically tractable expressions for the ruin prob- 
ability, we have so far concentrated on a claim size distribution B of phase-type. 
However, in many cases where such expressions are available there are classical 
results from the pre-phase-type-era which give alternative solutions under the 
slightly more general assumption that B has a Laplace-Stieltjes transform (or, 
equivalently, a m.g.f.) which is rational, i.e. the ratio between two polynomials 
(for the form of the density, see Example I.2.5). An alternative characterization 
is that such a distribution is matrix-exponential, i.e. that the density $b(x)$ can be
written as $\alpha e^{Tx}t$ for some row vector $\alpha$, some square matrix $T$ and some column
vector $t$ (the triple $(\alpha, T, t)$ is the representation of the matrix-exponential
distribution/density):


Proposition 6.1 Let $b(x)$ be an integrable function on $[0,\infty)$ and $b^*[\theta] =
\int_0^\infty e^{-\theta x}b(x)\,dx$ its Laplace transform. Then $b^*[\theta]$ is rational if and only if $b(x)$
is matrix-exponential. Furthermore, if

$$b^*[\theta] = \frac{b_1 + b_2\theta + b_3\theta^2 + \cdots + b_n\theta^{n-1}}{\theta^n + a_1\theta^{n-1} + \cdots + a_{n-1}\theta + a_n}, \tag{6.1}$$

then a matrix-exponential representation is given by $b(x) = \alpha e^{Tx}t$ where

$$\alpha = (b_1\ b_2\ \ldots\ b_{n-1}\ b_n),\qquad t = (0\ 0\ \ldots\ 0\ 1)', \tag{6.2}$$

$$T = \begin{pmatrix}
0 & 1 & 0 & \cdots & 0 & 0\\
0 & 0 & 1 & \cdots & 0 & 0\\
\vdots & & & \ddots & & \vdots\\
0 & 0 & 0 & \cdots & 0 & 1\\
-a_n & -a_{n-1} & -a_{n-2} & \cdots & -a_2 & -a_1
\end{pmatrix}. \tag{6.3}$$

Proof. If $b(x) = \alpha e^{Tx}t$, then $b^*[\theta] = \alpha(\theta I - T)^{-1}t$ which is rational since each
element of $(\theta I - T)^{-1}$ is so. Thus, matrix-exponentiality implies a rational
transform. The converse follows from the last statement of the theorem. For a
proof, see Asmussen & Bladt [74] (the representation (6.2), (6.3) was suggested
by Colm O’Cinneide, personal communication).
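The companion-form representation (6.2), (6.3) is easy to build mechanically. The sketch below is an illustration under stated assumptions rather than a reference implementation: the function names are hypothetical, the coefficient lists are ordered as `b = [b_1, ..., b_n]` and `a = [a_1, ..., a_n]`, and the check uses the Erlang(2) density $b(x) = \delta^2xe^{-\delta x}$ with transform $\delta^2/(\theta^2 + 2\delta\theta + \delta^2)$ as a test case.

```python
def companion_representation(b, a):
    """Return (alpha, T, t) as in (6.2), (6.3) for the rational transform
    (b[0] + b[1]*theta + ... + b[n-1]*theta**(n-1))
    / (theta**n + a[0]*theta**(n-1) + ... + a[n-1])."""
    n = len(a)
    alpha = list(b)
    t = [0.0] * (n - 1) + [1.0]
    T = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        T[i][i + 1] = 1.0                          # ones on the superdiagonal
    T[n - 1] = [-a[n - 1 - j] for j in range(n)]   # last row (-a_n, ..., -a_1)
    return alpha, T, t


def solve(M, rhs):
    """Gaussian elimination with partial pivoting; returns x with M x = rhs."""
    n = len(rhs)
    A = [row[:] + [rhs[i]] for i, row in enumerate(M)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for k in range(c, n + 1):
                A[r][k] -= f * A[c][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (A[i][n] - sum(A[i][k] * x[k] for k in range(i + 1, n))) / A[i][i]
    return x


def transform(alpha, T, t, theta):
    """Evaluate alpha (theta*I - T)^(-1) t."""
    n = len(t)
    M = [[(theta if i == j else 0.0) - T[i][j] for j in range(n)] for i in range(n)]
    x = solve(M, t)
    return sum(alpha[i] * x[i] for i in range(n))


delta = 2.0
alpha, T, t = companion_representation([delta ** 2, 0.0], [2 * delta, delta ** 2])
value_at_one = transform(alpha, T, t, 1.0)   # compare with delta**2 / (1 + delta)**2
```

Evaluating the transform by a linear solve, rather than through the closed form, checks the representation itself: the two agree at every $\theta$ that is not an eigenvalue of $T$.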


Remark 6.2 A remarkable feature of Proposition 6.1 is that it gives an explicit
Laplace transform inversion which may appear more appealing than the first
attempt to invert $b^*[\theta]$ one would do, namely to assume the roots $\delta_1,\ldots,\delta_n$ of
the denominator to be distinct and expand the r.h.s. of (6.1) as $\sum_{i=1}^nc_i/(\theta + \delta_i)$,
giving $b(x) = \sum_{i=1}^nc_ie^{-\delta_ix}$.

Example 6.3 A set of necessary and sufficient conditions for a distribution to
be phase-type are given in O’Cinneide [670]. One of his elementary criteria,
$b(x) > 0$ for $x > 0$, shows that the distribution $B$ with density $b(x) = c(1 -
\cos(2\pi x))e^{-x}$, where $c = 1 + 1/4\pi^2$, cannot be phase-type.
Writing

$$b(x) = c\bigl(-e^{(2\pi i - 1)x}/2 - e^{(-2\pi i - 1)x}/2 + e^{-x}\bigr),$$

it follows that a matrix-exponential representation $(\boldsymbol\beta, S, s)$ is given by

$$\boldsymbol\beta = (1\ 1\ 1),\qquad
S = \begin{pmatrix}2\pi i - 1 & 0 & 0\\ 0 & -2\pi i - 1 & 0\\ 0 & 0 & -1\end{pmatrix},\qquad
s = \begin{pmatrix}-c/2\\ -c/2\\ c\end{pmatrix}. \tag{6.4}$$

This representation is complex, but as follows from Proposition 6.1, we can
always obtain a real one $(\alpha, T, t)$. Namely, since

$$b^*[\theta] = \frac{1 + 4\pi^2}{\theta^3 + 3\theta^2 + (3 + 4\pi^2)\theta + 1 + 4\pi^2},$$

it follows by (6.2), (6.3) that we can take


$$\alpha = (1 + 4\pi^2\ \ 0\ \ 0),\qquad
T = \begin{pmatrix}0 & 1 & 0\\ 0 & 0 & 1\\ -1 - 4\pi^2 & -3 - 4\pi^2 & -3\end{pmatrix},\qquad
t = \begin{pmatrix}0\\ 0\\ 1\end{pmatrix}.$$


Example 6.4 This example shows why it is sometimes useful to work with
matrix-exponential distributions instead of phase-type distributions: for
dimensionality reasons. Consider the distribution with density

$$b(x) = \frac{15}{7 + 15\delta}\bigl((2e^{-2x} - 1)^2 + \delta\bigr)e^{-x}.$$

Then it is known from O’Cinneide [670] that $b$ is phase-type when $\delta > 0$, and
that the minimal number of phases in a phase-type representation increases to
$\infty$ as $\delta\downarrow 0$, leading to matrix calculus in high dimensions when $\delta$ is small. But
since

$$b^*[\theta] = \frac{15(1 + \delta)\theta^2 + 120\delta\theta + 225\delta + 105}{(7 + 15\delta)\theta^3 + (135\delta + 63)\theta^2 + (161 + 345\delta)\theta + 225\delta + 105},$$

Proposition 6.1 shows that a matrix-exponential representation can always be
obtained in dimension only 3 independent of $\delta$.
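The rational transform in Example 6.4 can be verified by expanding the square in the density and transforming term by term. The following worked computation assumes the density $b(x) = \frac{15}{7+15\delta}\bigl((2e^{-2x}-1)^2+\delta\bigr)e^{-x}$, i.e. $b(x) \propto 4e^{-5x} - 4e^{-3x} + (1+\delta)e^{-x}$:

```latex
b^*[\theta]
  = \frac{15}{7+15\delta}
    \left( \frac{4}{\theta+5} - \frac{4}{\theta+3} + \frac{1+\delta}{\theta+1} \right),
\qquad
(\theta+1)(\theta+3)(\theta+5) = \theta^3 + 9\theta^2 + 23\theta + 15,

4(\theta+1)(\theta+3) - 4(\theta+1)(\theta+5) + (1+\delta)(\theta+3)(\theta+5)
  = (1+\delta)\theta^2 + 8\delta\,\theta + 7 + 15\delta .
```

Multiplying the numerator by $15$ and the denominator by $7 + 15\delta$ reproduces the coefficients of the displayed ratio, and setting $\theta = 0$ confirms $b^*[0] = 1$.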


As for the role of matrix-exponential distributions in ruin probability
calculations, we shall only consider the compound Poisson model with arrival rate $\beta$
and a matrix-exponential claim size distribution $B$, and present two algorithms
for calculating $\psi(u)$ in that setting.

For the first, we take as starting point a representation of $b^*[\theta]$ as $p(\theta)/q(\theta)$
where $p, q$ are polynomials without common roots. Then (cf. Corollary IV.3.4)
the Laplace transform of the ruin probability is

$$\hat\psi[-\theta] = \int_0^\infty e^{-\theta u}\psi(u)\,du
 = \frac{\beta - \beta p(\theta)/q(\theta) - \rho\theta}{\theta\bigl(\beta - \theta - \beta p(\theta)/q(\theta)\bigr)}. \tag{6.5}$$

Thus, we have represented $\hat\psi[-\theta]$ as a ratio of polynomials (note that $\theta$ must
necessarily be a root of the numerator and cancels), and can use this to invert
by the method of Proposition 6.1 to get $\psi(u) = \boldsymbol\beta e^{Su}s$.

For the second algorithm, we use a representation $(\alpha, T, t)$ of $b(x)$. We
recall (see Section 3; recall that $t = -Te$) that if $B$ is phase-type and $(\alpha, T, t)$
a phase-type representation with $\alpha$ the initial vector, $T$ the phase generator
and $t = -Te$, then

$$\psi(u) = -\alpha_+e^{(T + t\alpha_+)u}\,T^{-1}t \quad\text{where } \alpha_+ = -\beta\alpha T^{-1}. \tag{6.6}$$

The remarkable fact is that, although the proof of (6.6) in Section 3 seems
to use the probabilistic interpretation of phase-type distributions in an essential
way, the formula carries over:


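Proposition 6.5 (6.6) holds true also in the matrix-exponential case.

Proof. Write

$$b^* = \alpha(\theta I - T)^{-1}t,\qquad b_+^* = \alpha_+(\theta I - T)^{-1}t,\qquad b_+^{**} = \alpha_+(\theta I - T)^{-1}T^{-1}t.$$

Then in Laplace transform formulation, the assertion is equivalent to

$$-\alpha_+(\theta I - T - t\alpha_+)^{-1}T^{-1}t = \frac{\beta - \beta b^* - \rho\theta}{\theta(\beta - \theta - \beta b^*)}, \tag{6.7}$$

cf. (6.5), (6.6). Presumably, this can be verified by analytic continuation from
the phase-type domain to the matrix-exponential domain, but we shall give an
algebraic proof. From the general matrix identity ([789, p. 519])

$$(A + UBV)^{-1} = A^{-1} - A^{-1}UB(B + BVA^{-1}UB)^{-1}BVA^{-1},$$

with $A = \theta I - T$, $U = -t$, $B = 1$ and $V = \alpha_+$, we get

$$(\theta I - T - t\alpha_+)^{-1}
 = (\theta I - T)^{-1} + (\theta I - T)^{-1}t\,\bigl(1 - \alpha_+(\theta I - T)^{-1}t\bigr)^{-1}\alpha_+(\theta I - T)^{-1}$$
$$ = (\theta I - T)^{-1} + \frac{1}{1 - b_+^*}\,(\theta I - T)^{-1}t\alpha_+(\theta I - T)^{-1}$$

so that

$$-\alpha_+(\theta I - T - t\alpha_+)^{-1}T^{-1}t = -b_+^{**} - \frac{b_+^*b_+^{**}}{1 - b_+^*} = \frac{b_+^{**}}{b_+^* - 1}.$$

Now, since

$$(\theta I - T)^{-1}T^{-1} = \frac1\theta\bigl(T^{-1} + (\theta I - T)^{-1}\bigr),$$
$$(\theta I - T)^{-1}T^{-2} = \frac1\theta T^{-2} + \frac1{\theta^2}T^{-1} + \frac1{\theta^2}(\theta I - T)^{-1}$$

and

$$1 = \int_0^\infty b(x)\,dx = -\alpha T^{-1}t,$$

we get (using also $\alpha T^{-2}t = \mu_B$ and $\rho = \beta\mu_B$)

$$b_+^* = -\beta\alpha T^{-1}(\theta I - T)^{-1}t = -\beta\alpha(\theta I - T)^{-1}T^{-1}t
 = -\frac\beta\theta\,\alpha\bigl(T^{-1} + (\theta I - T)^{-1}\bigr)t = \frac\beta\theta(1 - b^*),$$
$$b_+^{**} = -\beta\alpha T^{-1}(\theta I - T)^{-1}T^{-1}t = -\beta\alpha(\theta I - T)^{-1}T^{-2}t$$
$$ = -\beta\alpha\Bigl(\frac1\theta T^{-2} + \frac1{\theta^2}T^{-1} + \frac1{\theta^2}(\theta I - T)^{-1}\Bigr)t
 = -\frac\rho\theta + \frac\beta{\theta^2} - \frac\beta{\theta^2}\,b^*.$$

From this it is straightforward to check that $b_+^{**}/(b_+^* - 1)$ is the same as the
r.h.s. of (6.7).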
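A scalar sanity check of the identity behind Proposition 6.5 is the exponential case. The following worked computation assumes claims that are exponential with rate $\delta$, so that $\alpha = 1$, $T = -\delta$, $t = \delta$, $\alpha_+ = \beta/\delta = \rho$, and writes $\gamma = \delta - \beta$; the symbols $b^*$, $b_+^*$, $b_+^{**}$ are the three transforms used in the proof:

```latex
b^* = \frac{\delta}{\theta+\delta}, \qquad
b_+^* = \frac{\beta}{\theta+\delta}, \qquad
b_+^{**} = \rho \cdot \frac{1}{\theta+\delta} \cdot \Bigl(-\frac{1}{\delta}\Bigr) \cdot \delta
         = -\frac{\rho}{\theta+\delta},

\frac{b_+^{**}}{b_+^* - 1}
  = \frac{-\rho/(\theta+\delta)}{(\beta-\theta-\delta)/(\theta+\delta)}
  = \frac{\rho}{\theta+\gamma}
  = \int_0^\infty e^{-\theta u}\,\rho\, e^{-\gamma u}\,du .
```

This is the Laplace transform of $\rho e^{-\gamma u}$, the classical ruin probability for exponential claims, in agreement with Example 5.3.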


Notes and references As noted in the references to Section 4, some key early 
references using distributions with a rational transform for applied probability calcula- 
tions are Tacklind [826] (ruin probabilities) and Smith [814] (queueing theory). A key 
tool is identifying poles and zeros of transforms via Wiener-Hopf factorization. Much 
of the flavor of this classical approach and many examples are in Cohen [249]; see also 
Dufresne [332] and Kuznetsov [563] for a recent discussion. 

For expositions on the general theory of matrix-exponential distributions, see As- 
mussen & Bladt [74], Lipsky [600] and Asmussen & O’Cinneide [94]; a key early paper 
is Cox [264] (from where the distribution in Example 6.3 is taken). 

The proof of Proposition 6.5 is similar to arguments used in [74] for formulas in 
renewal theory. 

Some relevant more recent references on matrix-exponential distributions are Bean, 
Fackrell & Taylor [150], Bladt & Neuts [173] and Fackrell [360]. 


7 Reserve-dependent premiums 


We consider the model of Chapter VIII with Poisson arrivals at rate $\beta$, premium
rate $p(r)$ at level $r$ of the reserve $\{R_t\}$ and claim size distribution $B$ which we
assume to be of phase-type with representation $(E, \alpha, T)$.

In Corollary VIII.1.9, the ruin probability $\psi(u)$ was found in explicit form for
the case of $B$ being exponential (for some remarkable explicit formulas due to
Paulsen & Gjessing [687], see the Notes to VIII.1, but the argument of [687] does
not apply in any reasonable generality). We present here first a computational
approach for the general phase-type case (Section 7a) and next (Section 7b) a
set of formulas covering the case of a two-step premium rule, cf. VIII.1a.


7a Computing $\psi(u)$ via differential equations


The representation we use is essentially the same as the one used in Sections 3
and 4, to piece together the phases at downcrossing times of $\{R_t\}$ (upcrossing
times of $\{S_t\}$) to a Markov process $\{m_x\}$ with state space $E$. See Fig. IX.9,
which is self-explanatory given Fig. IX.6.


FIGURE IX.9 


The difference from the case $p(r) = p$ is that $\{m_x\}$, though still Markov,
is no longer time-homogeneous. Let $P(t_1, t_2)$ be the matrix with $ij$th element
$P(m_{t_2} = j \mid m_{t_1} = i)$, $0\le t_1\le t_2\le u$. Define further $\nu_i(u)$ as the probability
that the risk process starting from $R_0 = u$ downcrosses level $u$ for the first time
in phase $i$. Note that in general $\sum_{i\in E}\nu_i(u) < 1$. In fact, $\sum_{i\in E}\nu_i(u)$ is the
ruin probability for a risk process with initial reserve 0 and premium function
$p(u + \cdot)$. Also, in contrast to Section 3, the definition of $\{m_x\}$ depends on the
initial reserve $u = R_0$.

Since $\nu(u) = (\nu_i(u))_{i\in E}$ is the (defective) initial probability vector for $\{m_x\}$,
we obtain

$$\psi(u) = P(m_u\in E) = \nu(u)P(0,u)e = \lambda(u)e \tag{7.1}$$

where $\lambda(t) = \nu(u)P(0,t)$ is the vector of state probabilities for $m_t$, i.e. $\lambda_i(t) =
P(m_t = i)$. Given the $\nu(t)$ have been computed, the $\lambda(t)$ and hence $\psi(u)$ are
available by solving differential equations:


Proposition 7.1 $\lambda(0) = \nu(u)$ and $\lambda'(t) = \lambda(t)(T + t\nu(u - t))$, $0\le t\le u$.

Proof. The first statement is clear by definition. By general results on
time-inhomogeneous Markov processes,

$$P(t_1, t_2) = \exp\Bigl\{\int_{t_1}^{t_2}Q(v)\,dv\Bigr\} \tag{7.2}$$

where

$$Q(t) = \frac{d}{ds}\bigl[P(t, t+s) - I\bigr]\Big|_{s=0}. \tag{7.3}$$

However, the interpretation of $Q(t)$ as the intensity matrix of $\{m_x\}$ at time $t$
shows that $Q(t)$ is made up of two terms: obviously, $\{m_x\}$ has jumps of two
types, those corresponding to state changes in the underlying phase process and
those corresponding to the present jump of $\{R_t\}$ being terminated at level $u - t$
and being followed by a downcrossing. The intensity of a jump from $i$ to $j$ is
$t_{ij}$ for jumps of the first type and $t_i\nu_j(u - t)$ for the second. Hence $Q(t) =
T + t\nu(u - t)$ and

$$\lambda'(t) = \lambda(t)Q(t) = \lambda(t)(T + t\nu(u - t)).$$


Thus, from a computational point of view the remaining problem is to evaluate
the $\nu(t)$, $0\le t\le u$.


Proposition 7.2 For $i\in E$,

$$-\nu_i'(u)p(u) = \beta\alpha_i + \nu_i(u)\Bigl\{p(u)\sum_{j\in E}\nu_j(u)t_j - \beta\Bigr\} + p(u)\sum_{j\in E}\nu_j(u)t_{ji}. \tag{7.4}$$

Proof. Consider the event $A$ that there are no arrivals in the interval $[0, dt]$,
the probability of which is $1 - \beta\,dt$. Given $A^c$, the probability that level $u$ is
downcrossed for the first time in phase $i$ is $\alpha_i$. Given $A$, the probability that
level $u + p(u)dt$ is downcrossed for the first time in phase $j$ is $\nu_j(u + p(u)dt)$.
Given this occurs, two things can happen: either the current jump continues
from $u + p(u)dt$ to $u$, or it stops between level $u + p(u)dt$ and $u$. In the first
case, the probability of downcrossing level $u$ in phase $i$ is

$$\delta_{ji}(1 + p(u)dt\cdot t_{ii}) + (1 - \delta_{ji})p(u)dt\cdot t_{ji} = \delta_{ji} + p(u)t_{ji}\,dt,$$

whereas in the second case the probability is $p(u)dt\cdot t_j\nu_i(u)$. Thus, given $A$, the
probability of downcrossing level $u$ in phase $i$ for the first time is

$$\sum_{j\in E}\nu_j(u + p(u)dt)\bigl[\delta_{ji} + p(u)dt\cdot t_{ji} + p(u)dt\cdot t_j\nu_i(u)\bigr]$$
$$ = \nu_i(u) + \nu_i'(u)p(u)\,dt + p(u)\,dt\sum_{j\in E}\nu_j(u)\{t_{ji} + t_j\nu_i(u)\}.$$

Collecting terms, we get

$$\nu_i(u) = \beta\alpha_i\,dt + (1 - \beta\,dt)\nu_i(u) + \nu_i'(u)p(u)\,dt + p(u)\,dt\sum_{j\in E}\nu_j(u)\{t_{ji} + t_j\nu_i(u)\}.$$

Subtracting $\nu_i(u)$ on both sides and dividing by $dt$ yields the asserted differential
equation.
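It may help to see what the differential equation of Proposition 7.2 reduces to in the simplest case. The worked special case below assumes a single phase with exponential claims of rate $\delta$ (so $\alpha_1 = 1$, $t_1 = \delta$, $t_{11} = -\delta$) and a constant premium rate $p$; the scalar $\nu$ stands for $\nu_1$:

```latex
-\nu'(u)\,p
  = \beta + \nu(u)\bigl\{ p\,\delta\,\nu(u) - \beta \bigr\} - p\,\delta\,\nu(u)
  = \bigl(1 - \nu(u)\bigr)\bigl(\beta - p\,\delta\,\nu(u)\bigr).
```

The right-hand side vanishes at the constant $\nu = \beta/(p\delta)$, which is the value $-\beta\alpha T^{-1}/p$ known from Section 3 for a constant premium; the other stationary point $\nu = 1$ corresponds to certain downcrossing.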


When solving the differential equation in Proposition 7.2, we face the difficulty
that no boundary condition is immediately available. To deal with this,
consider a modification of the original process $\{R_t\}$ by linearizing the process
with some rate $p$, say, after a certain level $v$, say. Let $p^v(r)$, $R_t^v$, $P^v$ etc. refer to
the modified process. Then

$$p^v(r) = \begin{cases}p(r) & r\le v\\ p & r > v\end{cases}$$

and (no matter how $p$ is chosen) we have:


Lemma 7.3 For any fixed $u\ge 0$, $\nu_i(u) = \lim_{v\to\infty}\nu_i^v(u)$.

Proof. Let $A$ be the event that the process downcrosses level $u$ in phase $i$ given
that it starts at $u$ and let $B_v$ be the event

$$B_v = \Bigl\{\sigma < \infty,\ \sup_{t\le\sigma}R_t > v\Bigr\}$$

where $\sigma$ denotes the time of downcrossing level $u$. Then $P(B_v)$ is the tail of
a (defective) random variable so that $P(B_v)\to 0$ as $v\to\infty$, and similarly
$P^v(B_v)\to 0$.

Since the processes $R_t$ and $R_t^v$ coincide under level $v$, then $P(A\cap B_v^c) =
P^v(A\cap B_v^c)$. Now since both $P(A\cap B_v)\to 0$ and $P^v(A\cap B_v)\to 0$ as $v\to\infty$ we
have

$$P(A) - P^v(A) = P(A\cap B_v) + P(A\cap B_v^c) - P^v(A\cap B_v) - P^v(A\cap B_v^c)
 = P(A\cap B_v) - P^v(A\cap B_v)\to 0$$

as $v\to\infty$.

From Section 3, we have

$$p(r)\equiv p\ \Longrightarrow\ \nu_i(u) = -\frac\beta p\,\alpha T^{-1}e_i, \tag{7.5}$$

which implies that $\nu_i^v(v)$ is given by the r.h.s. of (7.5). Thus, we can first
for a given $v$ solve (7.4) backwards for $\{\nu^v(t)\}_{v\ge t\ge 0}$, starting from $\nu^v(v) =
-\beta\alpha T^{-1}/p$. This yields $\nu^v(u)$ for any values of $u$ and $v$ such that $u\le v$.
Next consider a sequence of solutions obtained from a sequence of initial values
$\{\nu^v(v)\}_v$, where, say, $v = u, 2u, 3u$ etc. Thus we obtain a sequence
of solutions that converges to $\{\nu(t)\}_{u\ge t\ge 0}$.
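The two-stage procedure (solve for $\nu$ backwards from a large level $v$, then solve Proposition 7.1 forwards) can be sketched for the simplest case. The code below is a hedged illustration, not the scheme of [75]: it assumes a single phase with exponential claims of rate `delta`, freezes the premium at `p(v)` above the level `v = mult * u` (the choice of the frozen rate is arbitrary according to Lemma 7.3), and uses a classical fourth-order Runge-Kutta step. The function name and the parameters `n`, `mult` are hypothetical.

```python
import math


def ruin_probability(u, p, beta, delta, n=200, mult=3):
    """Approximate psi(u) for exponential(delta) claims and premium function p.

    The scalar form of (7.4), nu'(r) = -(1 - nu)(beta - p(r) delta nu) / p(r),
    is solved backwards from v = mult * u, and then
    lambda'(t) = lambda(t) * delta * (nu(u - t) - 1) is solved forwards on [0, u]."""
    h = u / (2 * n)                       # half step, so nu is known at RK4 midpoints
    K = 2 * n * mult                      # grid index of level v
    v = K * h

    def dnu(r, nu):                       # derivative of nu with respect to the level r
        pr = p(min(r, v))
        return -(1.0 - nu) * (beta - pr * delta * nu) / pr

    nu = [0.0] * (K + 1)
    nu[K] = beta / (p(v) * delta)         # boundary value (7.5) at level v
    for k in range(K, 0, -1):             # one RK4 step from level k*h down to (k-1)*h
        r, y = k * h, nu[k]
        k1 = dnu(r, y)
        k2 = dnu(r - h / 2, y - h / 2 * k1)
        k3 = dnu(r - h / 2, y - h / 2 * k2)
        k4 = dnu(r - h, y - h * k3)
        nu[k - 1] = y - h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

    lam, H = nu[2 * n], 2 * h             # lambda(0) = nu(u)
    for j in range(n):                    # RK4 for lambda on [0, u] with step H
        i = 2 * n - 2 * j                 # nu index of level u - t at t = j*H
        f0, f1, f2 = (delta * (nu[i - m] - 1.0) for m in range(3))
        k1 = lam * f0
        k2 = (lam + H / 2 * k1) * f1
        k3 = (lam + H / 2 * k2) * f1
        k4 = (lam + H * k3) * f2
        lam += H / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return lam


# Constant premium 1: the result should match rho * exp(-(delta - beta) * u).
psi_const = ruin_probability(1.0, lambda r: 1.0, beta=1.0, delta=2.0)
closed_form = 0.5 * math.exp(-1.0)
# Premium increasing in the reserve: ruin should be less likely.
psi_incr = ruin_probability(1.0, lambda r: 1.0 + r, beta=1.0, delta=2.0)
```

Storing $\nu$ on a grid of half steps is a design choice that lets the forward Runge-Kutta stage read $\nu$ at its midpoints without interpolation. In practice one would repeat the computation for increasing `mult` until the result stabilizes, following the sequence $v = u, 2u, 3u, \ldots$ described above.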


Notes and references The exposition is based upon Asmussen & Bladt [75] which 
also contains numerical illustrations. 

The algorithm based upon the numerical solution of a Volterra integral equation
(Remark VIII.1.10, numerically implemented in Schock Petersen [695]) and the present
one based upon differential equations both require discretization along a discrete grid
$0, 1/n, 2/n, \ldots$. However, typically the complexity in $n$ is at best $O(n^2)$ for integral
equations but $O(n)$ for differential equations. The actual precision depends on the
particular numerical scheme being employed. The trapezoidal rule used in [695] gives
a precision of $O(n^{-3})$, while the fourth-order Runge-Kutta method implemented in
[75] gives $O(n^{-5})$.


7b Two-step premium rules 


We now assume the premium function to be constant in two levels as in VIII.1a,

$$p(r) = \begin{cases}p_1 & r\le v\\ p_2 & r > v.\end{cases} \tag{7.6}$$


We may think of the process $R_t$ as pieced together of two standard risk processes
$R_t^1$ and $R_t^2$ with constant premiums $p_1, p_2$, such that $R_t$ coincides with $R_t^1$ under
level $v$ and with $R_t^2$ above level $v$. Let $\psi^i(u) = \alpha_+^{(i)}e^{(T + t\alpha_+^{(i)})u}e$ denote the ruin
probability for $R_t^i$ where $\alpha_+^{(i)} = -\beta\alpha T^{-1}/p_i$, cf. Corollary 3.1. We recall
from Proposition VIII.1.12 that in addition to the $\psi^i(\cdot)$, the evaluation of $\psi(u)$
requires $\phi_v(u) = (1 - \psi^1(u))/(1 - \psi^1(v))$, $0\le u\le v$, which is available since
the $\psi^i(\cdot)$ are so, as well as $\pi(u)$, the probability of ruin between $\sigma$ and the next
upcrossing of $v$, where $\sigma = \inf\{t\ge 0:\ R_t < v\}$.

To evaluate m(u), let v(u) = aP eT Hap u), assuming u > v for the 
moment. Then v(u) is the initial distribution of the undershoot when down- 
crossing level v given that the process starts at u, i.e. for u > v the distribution 
of v — Ro (defined for ø < œ only) is defective phase-type with representation 
(v(u), T). Recall that ¢,(w) is the probability of upcrossing level v before ruin 
given the process starts at w < v. Therefore 


$$\pi(u) = \int_0^v \nu(u) e^{Tx} t\,\bigl(1 - \phi_v(v - x)\bigr)\,dx + \nu(u) e^{Tv} e \qquad (7.7)$$

(the integral is the contribution from $\{R_\sigma \ge 0\}$ and the last term the contribution from $\{R_\sigma < 0\}$). The integral in (7.7) equals


i 7 : nyt — YO 2) 
f v(u)eT tæ- f v(u)eT EOE dx 


= 1-v(u)eTe l T fı v(u)jeT”e — i! : v(ujeT ty (v — x) ach 


1— y® 
from which we see that 
us 1 ° Tx4,/,(1) 1 Tv 
m(u) = Hap v(u)e ty (v x) dx 1 pa) (v) (1 v(u)e e) 


The integral in (7.8) equals 
J ta eda 
0 


which, using Kronecker calculus (see A.4), can be written as 
(2) = (2) 
(v(u) 8 ac?) eTa] ») (T @(-T- ta®)) ae (Sect i” I} (toe). 


Thus, all quantities involved in the computation of y(u) have been found in 
matrix form. 


Example 7.4 Let {RI} be as in Example 3.2. Ie., B is hyperexponential 
corresponding to 


1 1 -3 0 3 


yields y(u) = 1, so 


_ 85 — 24e~% — e & 
~ 35 — 24e-¥ — e-6v ` 


Let A, = —3 + 2V2 and Ay = —3 — 2V2 be the eigenvalues of T + ta. Then 
one gets 


1 
e“ +e" = olu) 


2+1 1— v2 1 1 
v(u) = (LH ete + L V2 gato EEN +4 = 


10 1A alu 10 1 PA 
y® (u-v) = (z - sv2)e*" ye (s+ sv2)e™ X 
20 


yO), = a1" 
From (7.7) we see that we can write $\pi(u) = \nu(u)V_2$ where $V_2$ depends only on
$v$, and one gets
12e°" — 2 


35e6" — 2405" — 1 


40°” + 6 
35e — 2405” — 1 


Thus, z(u) = pi2(u)/pii(u) where 
pulu) = 35e%— 24e°” — 1, 


Pray) = (F -avaj + +(e a 


+(F +4va)e Muo) y +(2- 2072) ati), 


V2 = 


-R A2(u—v) 


In particular, 


192e° + 8 
21(35e% — 24e5" — 1)’ 
192e5 + 8 
35e8¥ + 1685" +7 ` 


Tv) = 


pv) = 


Thus all terms involved in the formulae for the ruin probability have been ex- 
plicitly derived. 


Notes and references The analysis and the example are from Asmussen & Bladt 
[75]. 


8 Erlangization for the finite horizon case 
We consider the Cramér-Lundberg model with parameters $\beta$, $B$ and recall from Corollary V.3.5 that an explicit formula for the Laplace transform w.r.t. $u$ of $E[e^{-\delta\tau(u)};\ \tau(u) < \infty]$ can be found in terms of the root $-\rho_\delta < 0$ of

$$\kappa(r) = \beta(\widehat{B}[r] - 1) - r = \delta. \qquad (8.1)$$


Thus the finite horizon ruin probability $\psi(u, T)$ can in principle be computed exactly via a double Laplace transform inversion. Now transform inversion is never entirely straightforward and even less so when it is higher-dimensional. We present in this section a numerical scheme that basically only requires a rootfinding and the computation of a matrix-exponential under the assumption that the claim size distribution is phase-type $(E, \alpha, T)$. The basic idea is to replace the deterministic time horizon $T$ by a r.v. $H_k$ that has an Erlang distribution with $k$ stages and mean $T$, that is, with density

$$\frac{\delta^k t^{k-1}}{(k-1)!}\, e^{-\delta t} \quad \text{where } \delta = \delta_k = k/T.$$


That is, we compute

$$\psi_k(u) = E\,\psi(u, H_k) = \int_0^\infty \psi(u, t)\, \frac{\delta^k t^{k-1}}{(k-1)!}\, e^{-\delta t}\, dt. \qquad (8.2)$$

Since the s.c.v. (squared coefficient of variation) of the Erlang distribution goes to 0 as $k \to \infty$, of course also $\psi_k(u) \to \psi(u, T)$. The case $k = 1$ of an exponential time horizon then comes out fairly easily, whereas a simple recursion scheme exists for going from $k$ to $k + 1$. Combining with an extrapolation idea yields a considerable improvement of the numerical scheme.

The approximation $\psi(u, T) \approx \psi_k(u)$ could be called Erlang smoothing. Namely, (8.2) means that we approximate $\psi(u, T)$ by the function $\psi(u, t)$ of $t$ smoothed by the kernel which is the Erlang density with mean $T$. Cf. Fig. IX.10.
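To see numerically how the Erlang kernel concentrates around $T$ as $k$ grows, the following Python sketch evaluates the Erlang density with $k$ stages and mean $T$ and checks its mean and squared coefficient of variation. It is an illustration only: the function names `erlang_pdf` and `erlang_moments`, the truncation of the integral at $20T$ and the use of the midpoint rule are assumptions made here, not part of the original exposition.

```python
import math

def erlang_pdf(t, k, T):
    """Density of an Erlang r.v. with k stages and mean T (rate delta = k/T)."""
    if t <= 0.0:
        return 0.0
    delta = k / T
    # Work in log-space so that delta**k and (k-1)! cannot overflow for large k.
    log_pdf = k * math.log(delta) + (k - 1) * math.log(t) - delta * t - math.lgamma(k)
    return math.exp(log_pdf)

def erlang_moments(k, T, n=100_000):
    """Mean and squared coefficient of variation, by the midpoint rule on [0, 20T].

    The truncation point 20T is an assumption of this sketch; the neglected
    tail mass is tiny for every k >= 1.
    """
    upper = 20.0 * T
    h = upper / n
    m0 = m1 = m2 = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        w = erlang_pdf(t, k, T) * h
        m0 += w
        m1 += t * w
        m2 += t * t * w
    mean = m1 / m0
    scv = (m2 / m0 - mean * mean) / (mean * mean)
    return mean, scv

if __name__ == "__main__":
    for k in (1, 5, 25):
        mean, scv = erlang_moments(k, T=1.0)
        print(k, round(mean, 4), round(scv, 4))
```

For $k$ stages the s.c.v. is $1/k$, so the second printed column should decrease towards 0 while the mean stays at $T$.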


Figure IX.10: Erlang smoothing 


Proceeding to the details, one may first note that the model is a special case of the Markovian environment model in Chapter VII. Namely, the state $J_t$ of the environment at time $t$ is the current exponential stage $1, \ldots, k$ of $H_k$. However, we have the difference that here the environmental process $J$ is terminating, whereas Chapter VII concentrates on $J$ being ergodic. Indeed, $J$ has terminated by time $t$ if $H_k < t$. Nevertheless, we may proceed along similar ideas as in VII.2, only now we reverse the sign and not the time. More precisely, we define $Y_x \in E_k = \{1, \ldots, k\} \times E$ to have the value $(i, j)$ if the upcrossing of $\{S_t\}$ of level $x$ occurs in state $j$ of the phase-type jump leading to the upcrossing and if $H_k$ at that time is in stage $i$. Obviously, $\{Y_x\}$ is a Markov process, and it is terminating since $H_k < \infty$. Furthermore, a jump occurs in two ways. It may be a consequence of a jump in the Markov process underlying the current claim. This changes only $j$, not $i$, so that the matrix of corresponding rates is $I \otimes T$. Or the current claim may terminate, in which case the new state will be $(k, \ell)$, with $k \ge i$ and $\ell \in E$ the phase at the next ladder point. Denoting by $\alpha^{(k)}$ the row vector of state probabilities when the first ladder point of $\{S_t\}$ occurs in Erlang stage $k$, it follows that the intensity matrix $U$ of $Y$ is given by

$$U = I \otimes T + \begin{pmatrix} t\alpha^{(1)} & t\alpha^{(2)} & t\alpha^{(3)} & \cdots & t\alpha^{(k)} \\ 0 & t\alpha^{(1)} & t\alpha^{(2)} & \cdots & t\alpha^{(k-1)} \\ 0 & 0 & t\alpha^{(1)} & \cdots & t\alpha^{(k-2)} \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & t\alpha^{(1)} \end{pmatrix}$$

(recall that $t = -Te$ denotes the exit vector of the phase-type distribution). We further get $\psi_k(u) = \alpha^* e^{Uu} e$ where $\alpha^* = (\alpha^{(1)}\ \alpha^{(2)}\ \ldots\ \alpha^{(k)})$. Thus, it only remains to compute the $\alpha^{(k)}$.

We first consider the exponential case k = 1. 


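Theorem 8.1 Define $\alpha_\delta = \alpha^{(1)}$. Then $\alpha_\delta = \beta\alpha(\rho_\delta I - T)^{-1}$ where $-\rho_\delta$ is the negative root of $\kappa(r) = \delta$, i.e. $\beta(\alpha(-rI - T)^{-1}t - 1) - r = \delta$.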


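Proof. We condition upon the time $t$ of the first claim where $S_{t-} = -t$. The exponential time exceeds $t$ w.p. $e^{-\delta t}$, and so, proceeding again along the lines of the second proof of Corollary 3.1, we conclude that

$$\alpha_\delta = \int_0^\infty \beta e^{-\beta t} e^{-\delta t}\, \alpha e^{(T + t\alpha_\delta)t}\, dt = \beta\alpha\bigl((\beta + \delta)I - T - t\alpha_\delta\bigr)^{-1}.$$

Thus

$$(\beta + \delta)\alpha_\delta - \alpha_\delta T - \alpha_\delta t\alpha_\delta = \beta\alpha. \qquad (8.3)$$

For brevity, write $\nu = \alpha(\rho_\delta I - T)^{-1}$. We will show that $\alpha_\delta = \beta\nu$ satisfies (8.3). We first note that the definition of $\rho_\delta$ implies $\beta\nu t = \beta + \delta - \rho_\delta$ and that

$$\nu T = \alpha(\rho_\delta I - T)^{-1}(-\rho_\delta I + T) + \alpha(\rho_\delta I - T)^{-1}\rho_\delta I = -\alpha + \rho_\delta\nu.$$

Inserting $\alpha_\delta = \beta\nu$ in the l.h.s. of (8.3), we therefore obtain

$$(\beta^2 + \beta\delta)\nu + \beta\alpha - \beta\rho_\delta\nu - \beta(\beta + \delta - \rho_\delta)\nu$$

which equals $\beta\alpha$, as should be. We omit the proof that $\beta\alpha(\rho_\delta I - T)^{-1}$ is the correct one among the solutions of (8.3).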
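As a sanity check of Theorem 8.1, consider the simplest phase-type case of exponential claims with rate $\nu$, where $\alpha = 1$, $T = -\nu$ and $t = \nu$ are scalars and $\widehat{B}[r] = \nu/(\nu - r)$. The Python sketch below (hypothetical function names, premium rate 1 as in the Cramér-Lundberg model above) solves $\kappa(r) = \delta$ in closed form, forms $\alpha_\delta$, and evaluates the exponential-horizon ruin probability $\psi_1(u) = \alpha_\delta e^{(T + t\alpha_\delta)u}$, which is the $k = 1$ case of $\psi_k(u) = \alpha^* e^{Uu} e$. The reduction of $\kappa(r) = \delta$ to a quadratic is specific to this scalar example.

```python
import math

def kappa(r, beta, nu):
    """kappa(r) = beta*(Bhat[r] - 1) - r for exponential(nu) claims, valid for r < nu."""
    return beta * (nu / (nu - r) - 1.0) - r

def rho_delta(beta, nu, delta):
    """Return rho_delta > 0 such that -rho_delta is the negative root of kappa(r) = delta."""
    # For exponential claims, kappa(r) = delta  <=>  r^2 + b*r - delta*nu = 0.
    b = beta - nu + delta
    r_neg = (-b - math.sqrt(b * b + 4.0 * delta * nu)) / 2.0
    return -r_neg

def alpha_delta(beta, nu, delta):
    """Scalar version of Theorem 8.1: beta * alpha * (rho_delta * I - T)^{-1} with alpha = 1, T = -nu."""
    return beta / (rho_delta(beta, nu, delta) + nu)

def psi_exponential_horizon(u, beta, nu, horizon):
    """psi_1(u) for an exponential horizon with mean `horizon`, i.e. delta = 1/horizon."""
    a = alpha_delta(beta, nu, 1.0 / horizon)
    # T + t*alpha_delta = -nu + nu*a in the scalar case.
    return a * math.exp((-nu + nu * a) * u)

if __name__ == "__main__":
    beta, nu = 1.0, 1.5
    for horizon in (1.0, 10.0, 1e6):
        print(horizon, psi_exponential_horizon(2.0, beta, nu, horizon))
```

In this scalar case equation (8.3) reads $(\beta + \delta + \nu)a - \nu a^2 = \beta$, and as the mean horizon tends to infinity the output approaches the classical value $(\beta/\nu)e^{-(\nu - \beta)u}$.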


For a general k, we have the following recursion: 


Theorem 8.2 alt) = (50{"” oft 5 en ta?) (051 —T - T 
j=2 


The proof is more complicated, and we refer to Asmussen, Avram & Usabel [71]. 


The algorithm can be improved by Richardson extrapolation. This is a general method (see e.g. Press et al. [715]) for computing a number $w$ accurately using a sequence $w_k \to w$ for which the convergence rate is known,

$$w - w_k = \frac{c}{k} + \frac{d}{k^2} + \cdots. \qquad (8.4)$$

Here $c$ is typically unknown but can be eliminated. Indeed, letting $w_k^* = (k + 1)w_{k+1} - k w_k$, it is clear that $w_k^* \to w$ and that one obtains an improved approximation of convergence rate $O(k^{-2})$.
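The elimination of the $1/k$ term can be sketched in a few lines of Python. The function name `richardson` and the synthetic test sequence are assumptions made for illustration; the sequence is taken to be of the form $w_k = w + c/k + d/k^2$ exactly, in which case the extrapolated error is $-d/(k(k+1))$.

```python
def richardson(w):
    """Given the list [w_1, w_2, ..., w_m], return [w*_1, ..., w*_{m-1}]
    with w*_k = (k + 1) * w_{k+1} - k * w_k."""
    # w[k - 1] holds w_k, so w[k] holds w_{k+1}.
    return [(k + 1) * w[k] - k * w[k - 1] for k in range(1, len(w))]

if __name__ == "__main__":
    # Synthetic sequence with made-up constants (not data from the text).
    w_true, c, d = 2.0, 0.5, 0.25
    seq = [w_true + c / k + d / k**2 for k in range(1, 8)]
    for k, (plain, extrapolated) in enumerate(zip(seq, richardson(seq)), start=1):
        print(k, abs(plain - w_true), abs(extrapolated - w_true))
```

The plain errors decay like $1/k$ while the extrapolated ones decay like $1/k^2$, at the cost of one extra term of the sequence.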

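In the present setting, $w = \psi(u, T)$, $w_k = \psi_k(u)$ and (8.4) simply follows by the CLT for the underlying Erlang r.v. $H_k$:

$$\begin{aligned} \psi_k(u) &= E\,\psi(u, H_k) \\ &= E\bigl[\psi(u, T) + \psi_T(u, T)(H_k - T) + \psi_{TT}(u, T)(H_k - T)^2/2 + \cdots\bigr] \\ &= \psi(u, T) + 0 + \psi_{TT}(u, T)\,\mathrm{Var}(H_k)/2 + \cdots \\ &= \psi(u, T) + \frac{T^2\,\psi_{TT}(u, T)}{2k} + \cdots \end{aligned}$$

where as usual $\psi_T$, $\psi_{TT}$ are the first and second order partial derivatives of $\psi$ w.r.t. $T$, and the last step uses $\mathrm{Var}(H_k) = T^2/k$.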


Example 8.3 For an illustration of the method, consider a highly skewed claim size distribution, namely a mixture of three exponential distributions with rates 0.015, 0.190, 5.51 and corresponding weights 0.004, 0.108 and 0.888. Choose the safety loading $\eta = 0.1$ and $T = 1$, $u = 0$. The exact value $\psi(u, T) = 2.28\%$ was calculated by transform inversion. Figure IX.11 shows the results of the Erlangization, with the circles corresponding to the simple method and the filled ones to the extrapolated values.


Figure IX.11: Erlangization and extrapolation


It is seen that the simple method produces good results even in the range 
k =5-7. This is maybe somewhat surprising, since the Erlang(7) distribution is 
quite far from being degenerate. The precision of the extrapolation method is 
remarkable. Even the value for k = 1 would suffice for all practical purposes! 


Notes and references The exposition is based upon Asmussen, Avram & Us- 
abel [71] (who also consider general phase-type horizons). Note, however, that Theo- 
rem 8.1 appears already in Avram & Usabel [112]. See also Ramaswami, Woolford & 
Stanford [723]. 

Exponential/Erlangian time horizons have also been used in finance, where the 
idea is known as Canadization. An early classical reference for the exponential case is 
Carr [223]. Erlangian horizons occur, e.g., in Kyprianou & Pistorius [566].


Chapter X


Ruin probabilities in the 
presence of heavy tails 


1 Subexponential distributions 


We are concerned with distributions $B$ with a heavy right tail $\overline{B}(x) = 1 - B(x)$.

A rough distinction between light and heavy tails is that the m.g.f. $\widehat{B}[r] = \int e^{rx} B(dx)$ is finite for some $r > 0$ in the light-tailed case and infinite for all $r > 0$ in the heavy-tailed case. For example, the exponential change of measure techniques discussed in III.3, IV.4-6 and at numerous later occasions require a light tail. Some main cases where this light-tail criterion is violated are
(a) distributions with a regularly varying tail, $\overline{B}(x) = L(x)/x^\alpha$ where $\alpha > 0$ and $L(x)$ is slowly varying, $L(tx)/L(x) \to 1$, $x \to \infty$, for all $t > 0$;
(b) the lognormal distribution (the distribution of $e^U$ where $U \sim N(\mu, \sigma^2)$) with density

$$\frac{1}{x\sqrt{2\pi\sigma^2}}\, e^{-(\log x - \mu)^2/2\sigma^2};$$

(c) the Weibull distribution with decreasing failure rate, $\overline{B}(x) = e^{-x^\beta}$ with $0 < \beta < 1$.
For further examples, see I.2b.

The definition $\widehat{B}[r] = \infty$ for all $r > 0$ of heavy tails is too general to allow for general non-trivial results on ruin probabilities, and instead we shall work within the class $\mathcal{S}$ of subexponential distributions. For the definition, we require that $B$ is concentrated on $(0, \infty)$ and say then that $B$ is subexponential ($B \in \mathcal{S}$) if

$$\frac{\overline{B^{*2}}(x)}{\overline{B}(x)} \to 2, \quad x \to \infty. \qquad (1.1)$$

Here $B^{*2}$ is the convolution square, that is, the distribution of the sum of independent r.v.'s $X_1, X_2$ with distribution $B$. In terms of r.v.'s, (1.1) then means $P(X_1 + X_2 > x) \sim 2P(X_1 > x)$.
To capture the intuition behind this definition, note first the following fact:


Proposition 1.1 Let $B$ be any distribution on $(0, \infty)$. Then:
(a) $P(\max(X_1, X_2) > x) \sim 2\overline{B}(x)$, $x \to \infty$.
(b) $\displaystyle \liminf_{x \to \infty} \frac{\overline{B^{*2}}(x)}{\overline{B}(x)} \ge 2$.


Proof. By the inclusion-exclusion formula, $P(\max(X_1, X_2) > x)$ is

$$P(X_1 > x) + P(X_2 > x) - P(X_1 > x, X_2 > x) = 2\overline{B}(x) - \overline{B}(x)^2 \sim 2\overline{B}(x),$$

proving (a). Since $B$ is concentrated on $(0, \infty)$, we have $\{\max(X_1, X_2) > x\} \subseteq \{X_1 + X_2 > x\}$, and thus the liminf in (b) is at least $\liminf P(\max(X_1, X_2) > x)/\overline{B}(x) = 2$.¹

The proof shows that the condition for $B \in \mathcal{S}$ is that the probability of the set $\{X_1 + X_2 > x\}$ is asymptotically the same as the probability of its subset $\{\max(X_1, X_2) > x\}$. That is, in the subexponential case the only way $X_1 + X_2$ can get large is roughly by one of the $X_i$ becoming large. We later show:


Proposition 1.2 If $B \in \mathcal{S}$, then

$$P(X_1 > x \mid X_1 + X_2 > x) \to \frac{1}{2}, \qquad P(X_1 \le y \mid X_1 + X_2 > x) \to \frac{1}{2}B(y).$$


That is, given $X_1 + X_2 > x$, the r.v. $X_1$ is w.p. 1/2 'typical' (with distribution $B$) and w.p. 1/2 it has the distribution of $X_1 \mid X_1 > x$. In contrast, the behavior in the light-tailed case is illustrated in the following example:


Example 1.3 Consider the standard exponential distribution, $\overline{B}(x) = e^{-x}$. Then $X_1 + X_2$ has an Erlang(2) distribution with density $ye^{-y}$ so that $\overline{B^{*2}}(x) \sim xe^{-x}$. Thus the liminf in Proposition 1.1(b) is $\infty$. In contrast to Proposition 1.2, one can check that

$$\Bigl(\frac{X_1}{x}, \frac{X_2}{x}\Bigr) \Bigm| X_1 + X_2 > x \ \xrightarrow{\ \mathcal{D}\ }\ (U, 1 - U)$$

where $U$ is uniform on (0,1). Thus, if $X_1 + X_2$ is large, then (with high probability) so are both of $X_1, X_2$ but none of them exceeds $x$.

¹ Note that it can be shown that for any heavy-tailed distribution one has in fact the stronger result $\liminf_{x \to \infty} \overline{B^{*2}}(x)/\overline{B}(x) = 2$, see Foss & Korshunov [365].
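The contrast between the two regimes can be checked numerically. The Python sketch below approximates the ratio $P(X_1 + X_2 > x)/P(X_1 > x)$ from the decomposition $P(X_1 + X_2 > x) = P(X_1 > x) + \int_0^x P(X_2 > x - y)\, b(y)\, dy$ for a distribution with density $b$. The Pareto tail $(1 + x)^{-\alpha}$ with index `ALPHA = 1.5`, the function names and the midpoint rule are all assumptions chosen for illustration.

```python
import math

def conv_tail_ratio(tail, density, x, n=100_000):
    """Approximate P(X1 + X2 > x) / P(X1 > x) for i.i.d. X1, X2 >= 0.

    Uses 1 + int_0^x tail(x - y) * density(y) dy / tail(x), with the integral
    evaluated by the midpoint rule on n subintervals.
    """
    h = x / n
    integral = 0.0
    for i in range(n):
        y = (i + 0.5) * h
        integral += tail(x - y) * density(y)
    return 1.0 + integral * h / tail(x)

ALPHA = 1.5  # hypothetical Pareto index, chosen only for this illustration

def pareto_tail(x):
    return (1.0 + x) ** (-ALPHA)

def pareto_density(x):
    return ALPHA * (1.0 + x) ** (-ALPHA - 1.0)

def exp_tail(x):
    return math.exp(-x)

def exp_density(x):
    return math.exp(-x)

if __name__ == "__main__":
    for x in (10.0, 100.0, 1000.0):
        print(x, conv_tail_ratio(pareto_tail, pareto_density, x))
    for x in (5.0, 10.0, 20.0):
        print(x, conv_tail_ratio(exp_tail, exp_density, x, n=2000))
```

For the Pareto tail the ratio settles near 2, whereas for the standard exponential distribution it equals $1 + x$ and grows without bound, in line with Example 1.3.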


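Here is the simplest example of subexponentiality:

Proposition 1.4 Any $B$ with a regularly varying tail is subexponential.

Proof. Assume $\overline{B}(x) = L(x)/x^\alpha$ with $L$ slowly varying and $\alpha > 0$. Let $0 < \delta < 1/2$. If $X_1 + X_2 > x$, then either one of the $X_i$ exceeds $(1 - \delta)x$, or they both exceed $\delta x$. Hence

$$\limsup_{x \to \infty} \frac{\overline{B^{*2}}(x)}{\overline{B}(x)} \le \limsup_{x \to \infty} \frac{2\overline{B}((1 - \delta)x) + \overline{B}(\delta x)^2}{\overline{B}(x)} = \limsup_{x \to \infty} \frac{2L((1 - \delta)x)/((1 - \delta)x)^\alpha}{L(x)/x^\alpha} + 0 = \frac{2}{(1 - \delta)^\alpha}.$$

Letting $\delta \downarrow 0$, we get $\limsup \overline{B^{*2}}(x)/\overline{B}(x) \le 2$, and combining with Proposition 1.1(b) we get $\overline{B^{*2}}(x)/\overline{B}(x) \to 2$.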


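We now turn to the mathematical theory of subexponential distributions.


Proposition 1.5 If $B \in \mathcal{S}$, then $\dfrac{\overline{B}(x - y)}{\overline{B}(x)} \to 1$ uniformly in $y \in [0, y_0]$ as $x \to \infty$.

[In terms of r.v.'s: if $X \sim B \in \mathcal{S}$ then the overshoot $X - x \mid X > x$ converges in distribution to $\infty$. This follows since the probability of the overshoot to exceed $y$ is $\overline{B}(x + y)/\overline{B}(x)$ which has limit 1.]


Proof. Consider first a fixed $y$. Using the identity

$$\frac{\overline{B^{*(n+1)}}(x)}{\overline{B}(x)} = 1 + \frac{B(x) - B^{*(n+1)}(x)}{\overline{B}(x)} = 1 + \int_0^x \frac{\overline{B^{*n}}(x - z)}{\overline{B}(x)}\, B(dz) \qquad (1.2)$$

with $n = 1$ and splitting the integral into two corresponding to the intervals $[0, y]$ and $(y, x]$, we get

$$\frac{\overline{B^{*2}}(x)}{\overline{B}(x)} \ge 1 + B(y) + \frac{\overline{B}(x - y)}{\overline{B}(x)}\bigl(B(x) - B(y)\bigr).$$

If $\limsup \overline{B}(x - y)/\overline{B}(x) > 1$, we therefore get $\limsup \overline{B^{*2}}(x)/\overline{B}(x) > 1 + B(y) + 1 - B(y) = 2$, a contradiction. Finally $\liminf \overline{B}(x - y)/\overline{B}(x) \ge 1$ since $y \ge 0$.

The uniformity now follows from what has been shown for $y = y_0$ and the obvious inequality

$$1 \le \frac{\overline{B}(x - y)}{\overline{B}(x)} \le \frac{\overline{B}(x - y_0)}{\overline{B}(x)}, \quad y \in [0, y_0].$$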


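Corollary 1.6 If $B \in \mathcal{S}$, then $e^{\epsilon x}\overline{B}(x) \to \infty$, $\widehat{B}[\epsilon] = \infty$ for all $\epsilon > 0$.


Proof. For $0 < \delta < \epsilon$, we have by Proposition 1.5 that $\overline{B}(n) \ge e^{-\delta}\overline{B}(n - 1)$ for all large $n$ so that $\overline{B}(n) \ge c_1 e^{-\delta n}$ for all $n$. This implies $\overline{B}(x) \ge c_2 e^{-\delta x}$ for all $x$, and this immediately yields the desired conclusions.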


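Proof of Proposition 1.2.

$$P(X_1 > x \mid X_1 + X_2 > x) = \frac{P(X_1 > x)}{P(X_1 + X_2 > x)} = \frac{\overline{B}(x)}{\overline{B^{*2}}(x)} \to \frac{1}{2},$$

$$P(X_1 \le y \mid X_1 + X_2 > x) = \frac{1}{\overline{B^{*2}}(x)}\int_0^y \overline{B}(x - z)\, B(dz) \sim \frac{1}{2\overline{B}(x)}\int_0^y \overline{B}(x - z)\, B(dz) \to \frac{1}{2}B(y),$$

using Proposition 1.5 and dominated convergence.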


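The following result is extremely important and is often taken as definition of the class $\mathcal{S}$; its intuitive content is the same as discussed in the case $n = 2$ above.


Proposition 1.7 If $B \in \mathcal{S}$, then for any $n$, $\overline{B^{*n}}(x)/\overline{B}(x) \to n$ as $x \to \infty$.


Proof. We use induction. The case $n = 2$ is just the definition, so assume the proposition has been shown for $n$. Given $\epsilon > 0$, choose $y$ such that $|\overline{B^{*n}}(x)/\overline{B}(x) - n| \le \epsilon$ for $x \ge y$. Then by (1.2),

$$\frac{\overline{B^{*(n+1)}}(x)}{\overline{B}(x)} = 1 + \Bigl(\int_0^{x - y} + \int_{x - y}^x\Bigr) \frac{\overline{B^{*n}}(x - z)}{\overline{B}(x - z)}\,\frac{\overline{B}(x - z)}{\overline{B}(x)}\, B(dz).$$

Here the second integral can be bounded by

$$\sup_{v \ge 0} \frac{\overline{B^{*n}}(v)}{\overline{B}(v)} \cdot \frac{B(x) - B(x - y)}{\overline{B}(x)}$$

which converges to 0 by Proposition 1.5 and the induction hypothesis. The first integral is

$$(n + O(\epsilon)) \int_0^{x - y} \frac{\overline{B}(x - z)}{\overline{B}(x)}\, B(dz) = (n + O(\epsilon)) \Bigl\{ \frac{B(x) - B^{*2}(x)}{\overline{B}(x)} - \int_{x - y}^x \frac{\overline{B}(x - z)}{\overline{B}(x)}\, B(dz) \Bigr\}.$$

Here the first term in $\{\cdot\}$ converges to 1 (by the definition of $B \in \mathcal{S}$) and the second to 0 since it is bounded by $(B(x) - B(x - y))/\overline{B}(x)$. Combining these estimates and letting $\epsilon \downarrow 0$ completes the proof.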


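Lemma 1.8 If $B \in \mathcal{S}$, $\epsilon > 0$, then there exists a constant $K = K_\epsilon$ such that $\overline{B^{*n}}(x) \le K(1 + \epsilon)^n \overline{B}(x)$ for all $n$ and $x$.


Proof. Choose $T$ such that $(B(x) - B^{*2}(x))/\overline{B}(x) \le 1 + \epsilon$ for $x \ge T$ and let $A = 1/\overline{B}(T)$, $\alpha_n = \sup_{x \ge 0} \overline{B^{*n}}(x)/\overline{B}(x)$. Then by (1.2), for all $n$

$$\begin{aligned} \alpha_{n+1} &\le 1 + \sup_{x \le T} \int_0^x \frac{\overline{B^{*n}}(x - z)}{\overline{B}(x)}\, B(dz) + \sup_{x > T} \int_0^x \frac{\overline{B^{*n}}(x - z)}{\overline{B}(x - z)}\,\frac{\overline{B}(x - z)}{\overline{B}(x)}\, B(dz) \\ &\le 1 + A + \alpha_n \sup_{x > T} \int_0^x \frac{\overline{B}(x - z)}{\overline{B}(x)}\, B(dz) \le 1 + A + \alpha_n(1 + \epsilon). \end{aligned}$$

Iterating, we get with $\alpha_1 = 1$

$$\alpha_{n+1} \le (1 + A)\,\frac{1 - (1 + \epsilon)^n}{1 - (1 + \epsilon)} + (1 + \epsilon)^n.$$

Take $K = (1 + (1 + A)/\epsilon)/(1 + \epsilon)$.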


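Proposition 1.9 Let $A_1, A_2$ be distributions on $(0, \infty)$ such that $\overline{A_i}(x) \sim a_i \overline{B}(x)$ for some $B \in \mathcal{S}$ and some constants $a_1, a_2$ with $a_1 + a_2 > 0$. Then $\overline{A_1 * A_2}(x) \sim (a_1 + a_2)\overline{B}(x)$.


Proof. Let $X_1, X_2$ be independent r.v.'s such that $X_i$ has distribution $A_i$. Then by definition $\overline{A_1 * A_2}(x) = P(X_1 + X_2 > x)$. For any fixed $v$, Proposition 1.5 easily yields

$$P(X_1 + X_2 > x,\ X_i \le v) = \int_0^v \overline{A_j}(x - y)\, A_i(dy) \sim a_j \overline{B}(x) A_i(v) = a_j \overline{B}(x)(1 + o_v(1))$$

($j = 3 - i$). Since

$$P(X_1 + X_2 > x,\ X_1 > x - v,\ X_2 > x - v) \le \overline{A_1}(x - v)\overline{A_2}(x - v) \sim a_1 a_2 \overline{B}(x)^2$$

which can be neglected, it follows that it is necessary and sufficient for the assertion to be true that

$$\int_v^{x - v} \overline{A_j}(x - y)\, A_i(dy) = \overline{B}(x)\, o_v(1). \qquad (1.3)$$

Using the necessity part in the case $A_1 = A_2 = B$ yields

$$\int_v^{x - v} \overline{B}(x - y)\, B(dy) = \overline{B}(x)\, o_v(1). \qquad (1.4)$$

Now (1.3) follows if

$$\int_v^{x - v} \overline{B}(x - y)\, A_i(dy) = \overline{B}(x)\, o_v(1). \qquad (1.5)$$

By a change of variables, the l.h.s. of (1.5) becomes

$$\overline{B}(x - v)\overline{A_i}(v) - \overline{A_i}(x - v)\overline{B}(v) + \int_v^{x - v} \overline{A_i}(x - y)\, B(dy).$$

Here approximately the last term is $\overline{B}(x)\, o_v(1)$ by (1.4), whereas the two first yield $\overline{B}(x)(\overline{A_i}(v) - a_i\overline{B}(v)) = \overline{B}(x)\, o_v(1)$.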


Corollary 1.10 The class $\mathcal{S}$ is closed under tail-equivalence. That is, if $\overline{A}(x) \sim a\overline{B}(x)$ for some $B \in \mathcal{S}$ and some constant $a > 0$, then $A \in \mathcal{S}$.


Proof. Taking $A_1 = A_2 = A$, $a_1 = a_2 = a$ yields $\overline{A^{*2}}(x) \sim 2a\overline{B}(x) \sim 2\overline{A}(x)$.


Corollary 1.11 Let $B \in \mathcal{S}$ and let $A$ be any distribution with a lighter tail, $\overline{A}(x) = o(\overline{B}(x))$. Then $A * B \in \mathcal{S}$ and $\overline{A * B}(x) \sim \overline{B}(x)$.


Proof. Take $A_1 = A$, $A_2 = B$ so that $a_1 = 0$, $a_2 = 1$.


It is tempting to conjecture that $\mathcal{S}$ is closed under convolution, i.e. $B_1 * B_2 \in \mathcal{S}$ and $\overline{B_1 * B_2}(x) \sim \overline{B_1}(x) + \overline{B_2}(x)$ when $B_1, B_2 \in \mathcal{S}$. However, $B_1 * B_2 \in \mathcal{S}$ does not hold in full generality (but once $B_1 * B_2 \in \mathcal{S}$ has been shown, $\overline{B_1 * B_2}(x) \sim \overline{B_1}(x) + \overline{B_2}(x)$ follows precisely as in the proof of Proposition 1.9). In the regularly varying case, it is easy to see that if $L_1, L_2$ are slowly varying, then so is $L = L_1 + L_2$. Hence:

Corollary 1.12 Assume that $\overline{B_i}(x) = L_i(x)/x^\alpha$, $i = 1, 2$, with $\alpha > 0$ and $L_1, L_2$ slowly varying. Then $L = L_1 + L_2$ is slowly varying and $\overline{B_1 * B_2}(x) \sim L(x)/x^\alpha$.


We next give a classical sufficient (and close to necessary) condition for subexponentiality due to Pitman [705]. Recall that the failure rate $\lambda(x)$ of a distribution $B$ with density $b$ is $\lambda(x) = b(x)/\overline{B}(x)$.


Proposition 1.13 Let $B$ have density $b$ and failure rate $\lambda(x)$ such that $\lambda(x)$ is decreasing for $x \ge x_0$ with limit 0 at $\infty$. Then $B \in \mathcal{S}$ provided

$$\int_0^\infty e^{x\lambda(x)}\, b(x)\, dx < \infty.$$


Proof. We may assume that $\lambda(x)$ is everywhere decreasing (otherwise, replace $B$ by a tail equivalent distribution with a failure rate which is everywhere decreasing). Define $\Lambda(x) = \int_0^x \lambda(y)\, dy$. Then $\overline{B}(x) = e^{-\Lambda(x)}$. By (1.2),

$$\begin{aligned} \frac{\overline{B^{*2}}(x)}{\overline{B}(x)} - 1 &= \int_0^x \frac{\overline{B}(x - y)}{\overline{B}(x)}\, b(y)\, dy = \int_0^x e^{\Lambda(x) - \Lambda(x - y) - \Lambda(y)}\lambda(y)\, dy \\ &= \int_0^{x/2} e^{\Lambda(x) - \Lambda(x - y) - \Lambda(y)}\lambda(y)\, dy + \int_0^{x/2} e^{\Lambda(x) - \Lambda(x - y) - \Lambda(y)}\lambda(x - y)\, dy. \end{aligned}$$

For $y < x/2$,

$$\Lambda(x) - \Lambda(x - y) \le y\lambda(x - y) \le y\lambda(y).$$

The rightmost bound shows that the integrand in the first integral is bounded by $e^{y\lambda(y) - \Lambda(y)}\lambda(y) = e^{y\lambda(y)}b(y)$, an integrable function by assumption. The middle bound shows that it converges to $b(y)$ for any fixed $y$ since $\lambda(x - y) \to 0$. Thus by dominated convergence, the first integral has limit 1. Since $\lambda(x - y) \le \lambda(y)$ for $y < x/2$, we can use the same domination for the second integral but now the integrand has limit 0. Thus $\overline{B^{*2}}(x)/\overline{B}(x) - 1$ has limit $1 + 0$, proving $B \in \mathcal{S}$.


Example 1.14 Consider the DFR Weibull case $\overline{B}(x) = e^{-x^\beta}$ with $0 < \beta < 1$. Then $b(x) = \beta x^{\beta - 1}e^{-x^\beta}$, $\lambda(x) = \beta x^{\beta - 1}$. Thus $\lambda(x)$ is everywhere decreasing, and $e^{x\lambda(x)}b(x) = \beta x^{\beta - 1}e^{-(1 - \beta)x^\beta}$ is integrable. Thus, the DFR Weibull distribution is subexponential.


Example 1.15 For the lognormal distribution,

$$\lambda(x) = \frac{e^{-(\log x - \mu)^2/2\sigma^2}/(x\sqrt{2\pi\sigma^2})}{\Phi(-(\log x - \mu)/\sigma)} \sim \frac{\log x}{\sigma^2 x}.$$

This yields easily that $e^{x\lambda(x)}b(x)$ is integrable. Further, elementary but tedious calculations (which we omit) show that $\lambda(x)$ is ultimately decreasing. Thus, the lognormal distribution is subexponential.


In the regularly varying case, subexponentiality has already been proved in 
Corollary 1.12. To illustrate how Proposition 1.13 works in this setting, we first 
quote Karamata’s theorem (Bingham, Goldie & Teugels [169]): 


Proposition 1.16 For $L(x)$ slowly varying and $\alpha > 1$,

$$\int_x^\infty \frac{L(y)}{y^\alpha}\, dy \sim \frac{L(x)}{(\alpha - 1)x^{\alpha - 1}}.$$


From this we get


Proposition 1.17 If $B$ has a density of the form $b(x) = \alpha L(x)/x^{\alpha + 1}$ with $L(x)$ slowly varying and $\alpha > 1$, then $\overline{B}(x) \sim L(x)/x^\alpha$ and $\lambda(x) \sim \alpha/x$.


Thus $e^{x\lambda(x)}b(x) \sim e^\alpha b(x)$ is integrable. However, the monotonicity condition in Proposition 1.13 may present a problem in some cases so that the direct proof in Proposition 1.4 is necessary in full generality.


We conclude with a property of subexponential distributions which is often extremely important: under some mild smoothness assumptions, the overshoot properly normalized has a limit which is Pareto if $B$ is regularly varying and exponential for distributions like the lognormal or Weibull. More precisely, let $X^{(x)} = X - x \mid X > x$ and define the mean excess function $e(x) = EX^{(x)}$ (in insurance mathematics and in particular in reinsurance, the term stop-loss transform for the unconditional expectation $E(X - x)^+ = \int_x^\infty \overline{B}(y)\, dy$ is common). Then:


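Proposition 1.18 (a) If $\overline{B}(x) = L(x)/x^\alpha$ with $L(x)$ slowly varying and $\alpha > 1$, then $e(x) \sim x/(\alpha - 1)$ and

$$P(X^{(x)}/e(x) > y) \to \frac{1}{(1 + y/(\alpha - 1))^\alpha}; \qquad (1.6)$$

(b) Assume that for any $y_0$ the failure rate $\lambda(\cdot)$ satisfies

$$\frac{\lambda(x + y/\lambda(x))}{\lambda(x)} \to 1 \qquad (1.7)$$

uniformly for $y \in (0, y_0]$. Then $e(x) \sim 1/\lambda(x)$ and

$$P(X^{(x)}/e(x) > y) \to e^{-y}; \qquad (1.8)$$

(c) Under the assumptions of either (a) or (b), $\int_x^\infty \overline{B}(y)\, dy \sim e(x)\overline{B}(x)$.


Proof. (a): Using Karamata's theorem, we get

$$EX^{(x)} = \frac{E(X - x)^+}{P(X > x)} = \frac{1}{\overline{B}(x)}\int_x^\infty \overline{B}(y)\, dy = \frac{1}{L(x)/x^\alpha}\int_x^\infty \frac{L(y)}{y^\alpha}\, dy \sim \frac{L(x)/((\alpha - 1)x^{\alpha - 1})}{L(x)/x^\alpha} = \frac{x}{\alpha - 1}.$$

Further

$$\begin{aligned} P((\alpha - 1)X^{(x)}/x > y) &= P(X > x[1 + y/(\alpha - 1)] \mid X > x) \\ &= \frac{L(x[1 + y/(\alpha - 1)])}{L(x)} \cdot \frac{x^\alpha}{(x[1 + y/(\alpha - 1)])^\alpha} \to \frac{1}{(1 + y/(\alpha - 1))^\alpha}. \end{aligned}$$

We omit the proof of (c) and that $EX^{(x)} \sim 1/\lambda(x)$. The remaining statement (1.8) in (b) then follows from

$$\begin{aligned} P(\lambda(x)X^{(x)} > y) &= P(X > x + y/\lambda(x) \mid X > x) = \exp\{\Lambda(x) - \Lambda(x + y/\lambda(x))\} \\ &= \exp\Bigl\{-\int_0^{y/\lambda(x)} \lambda(x + z)\, dz\Bigr\} = \exp\Bigl\{-\int_0^y \frac{\lambda(x + u/\lambda(x))}{\lambda(x)}\, du\Bigr\} \\ &= \exp\{-y(1 + o(1))\}. \end{aligned}$$


The property (1.7) is referred to as $1/\lambda(x)$ being self-neglecting. It is trivially verified to hold for the Weibull and lognormal distributions, cf. Examples 1.14, 1.15.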
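A quick numerical look at the mean excess function in the Weibull case illustrates the relation $e(x) \sim 1/\lambda(x)$. The Python sketch below integrates the tail by the midpoint rule; the shape parameter $\beta = 0.5$, the truncation point `upper` and the function names are assumptions made for illustration. For $\beta = 0.5$ one has the closed form $\int_x^\infty e^{-\sqrt{y}}\, dy = 2(\sqrt{x} + 1)e^{-\sqrt{x}}$, so that $e(x) = 2(\sqrt{x} + 1)$ while $1/\lambda(x) = 2\sqrt{x}$.

```python
import math

def mean_excess(tail, x, upper, n=100_000):
    """e(x) = int_x^upper tail(y) dy / tail(x) by the midpoint rule.

    `upper` truncates the integral and must be chosen so large that the
    neglected part of the tail integral is negligible.
    """
    h = (upper - x) / n
    total = 0.0
    for i in range(n):
        total += tail(x + (i + 0.5) * h)
    return total * h / tail(x)

def weibull_tail(x, beta=0.5):
    """Tail exp(-x**beta) of the DFR Weibull distribution (0 < beta < 1)."""
    return math.exp(-x ** beta)

def weibull_failure_rate(x, beta=0.5):
    """lambda(x) = beta * x**(beta - 1)."""
    return beta * x ** (beta - 1.0)

if __name__ == "__main__":
    for x, upper in ((100.0, 4000.0), (10_000.0, 25_000.0)):
        e_x = mean_excess(weibull_tail, x, upper)
        print(x, e_x, e_x * weibull_failure_rate(x))
```

The product $e(x)\lambda(x)$ equals $1 + 1/\sqrt{x}$ here, so it tends to 1, though only slowly.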

The mean excess function will play a main role later in Section 4 in connection with finite-horizon ruin probabilities and in Section 6 in connection with tail estimation.

Notes and references Good general references for subexponential distributions are Embrechts, Klüppelberg & Mikosch [349] and Rolski et al. [746].

In the last decade, there has been a considerable literature on the theory of subexponential distributions. One direction is local subexponentiality, which in its simplest form has estimates for the density of the form $b^{*n}(x) \sim nb(x)$ and more generally gives conditions for $B^{*n}(x + y) - B^{*n}(x) \sim n(B(x + y) - B(x))$ for any fixed $y$. See, e.g., Asmussen, Foss & Korshunov [77]. Another direction is variants of the definition. Some of these are slight generalizations like intermediate regular variation, following up on Cline [248], others are slightly less general classes designed typically for an ad hoc purpose of pursuing some specific line of applications. The perspective of such studies may be a matter of taste. We would like, however, to point out one specific class which has proved rather robust, the class $\mathcal{S}^* \subset \mathcal{S}$ originally introduced by Klüppelberg by the requirement


$$\int_0^{x/2} \overline{B}(x - y)\overline{B}(y)\, dy \sim \mu_B \overline{B}(x). \qquad (1.9)$$


For the intuition, note that $\overline{B}(x - y)/\overline{B}(x) \to 1$ for any $y$ so one expects the integral divided by $\overline{B}(x)$ to have a limit $\int_0^\infty \overline{B}(y)\, dy = \mu_B$. However, there is nothing like a dominated convergence argument to justify this, and in fact (1.9) may fail in some exceptional cases.


2 The compound Poisson model 


Consider the compound Poisson model with arrival intensity $\beta$ and claim size distribution $B$. Let $S_t = \sum_{i=1}^{N_t} U_i - t$ be the claim surplus at time $t$ and $M = \sup_{t \ge 0} S_t$, $\tau(u) = \inf\{t > 0 : S_t > u\}$. We assume $\rho = \beta\mu_B < 1$ and are interested in the ruin probability $\psi(u) = P(M > u) = P(\tau(u) < \infty)$. Recall that $B_0$ denotes the stationary excess distribution, $B_0(x) = \int_0^x \overline{B}(y)\, dy / \mu_B$.


Theorem 2.1 If $B_0 \in \mathcal{S}$, then $\psi(u) \sim \dfrac{\rho}{1 - \rho}\,\overline{B_0}(u)$.

The proof is based upon the following lemma (stated slightly more generally than needed at present).


Lemma 2.2 Let $Y_1, Y_2, \ldots$ be i.i.d. with common distribution $G \in \mathcal{S}$ and let $K$ be an independent integer-valued r.v. with $Ez^K < \infty$ for some $z > 1$. Then $P(Y_1 + \cdots + Y_K > u) \sim EK\,\overline{G}(u)$.


Proof. Recall from Section 1 that $\overline{G^{*n}}(u) \sim n\overline{G}(u)$, $u \to \infty$, and that for each $z > 1$ there is a $D < \infty$ such that $\overline{G^{*n}}(u) \le \overline{G}(u)Dz^n$ for all $u$. We get

$$\frac{P(Y_1 + \cdots + Y_K > u)}{\overline{G}(u)} = \sum_{n=0}^\infty P(K = n)\,\frac{\overline{G^{*n}}(u)}{\overline{G}(u)} \to \sum_{n=0}^\infty P(K = n)\cdot n = EK$$

using dominated convergence with $\sum_n P(K = n)Dz^n$ as majorant.

Proof of Theorem 2.1. The Pollaczeck-Khinchine formula states that (in the set-up of Lemma 2.2) $M \stackrel{\mathcal{D}}{=} Y_1 + \cdots + Y_K$ where the $Y_i$ have distribution $B_0$ and $K$ is geometric with parameter $\rho$, $P(K = k) = (1 - \rho)\rho^k$. Since $EK = \rho/(1 - \rho)$ and $Ez^K < \infty$ whenever $\rho z < 1$, the result follows immediately from Lemma 2.2.
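The compound geometric representation also gives a simple way to compare the first-order approximation of Theorem 2.1 with a numerical value of $\psi(u)$. The Python sketch below rounds the $Y_i$ to a grid of width $h$ and solves the renewal-type recursion for the probability mass function of the geometric sum (a Panjer-type recursion). This discretization, the half-cell correction at $u$, and all function names are assumptions of the sketch, not a method taken from the text; the Pareto-type tail $\overline{B_0}(x) = (1 + x)^{-1.5}$ in the demo is likewise a made-up example.

```python
import math

def compound_geometric_tail(tail_B0, rho, u, h=0.01):
    """Approximate psi(u) = P(Y_1 + ... + Y_K > u), P(K = k) = (1 - rho) * rho**k, Y_i ~ B_0.

    The Y_i are rounded to the nearest point of the grid {0, h, 2h, ...}; the
    mass function g of the sum solves g = (1 - rho) * delta_0 + rho * (f * g).
    Intended for u equal to a multiple of h.
    """
    n = int(round(u / h))
    # f[j] = probability that Y is rounded to j*h (tail_B0(0) = 1 is assumed).
    f = [1.0 - tail_B0(0.5 * h)]
    for j in range(1, n + 1):
        f.append(tail_B0((j - 0.5) * h) - tail_B0((j + 0.5) * h))
    denom = 1.0 - rho * f[0]
    g = [(1.0 - rho) / denom]
    for m in range(1, n + 1):
        s = 0.0
        for j in range(1, m + 1):
            s += f[j] * g[m - j]
        g.append(rho * s / denom)
    if n == 0:
        return 1.0 - g[0]
    # The mass at grid point n*h represents a cell centred at u: count half of it.
    return 1.0 - sum(g[:n]) - 0.5 * g[n]

def subexponential_approx(tail_B0, rho, u):
    """First-order approximation of Theorem 2.1: rho / (1 - rho) * tail of B_0 at u."""
    return rho / (1.0 - rho) * tail_B0(u)

if __name__ == "__main__":
    def pareto_B0_tail(x):
        return (1.0 + x) ** (-1.5)
    rho = 0.5
    for u in (5.0, 10.0, 20.0):
        numeric = compound_geometric_tail(pareto_B0_tail, rho, u, h=0.02)
        print(u, numeric, subexponential_approx(pareto_B0_tail, rho, u))
```

For exponential claims with rate 1 the recursion can be checked against the classical formula $\psi(u) = \rho e^{-(1 - \rho)u}$. For the heavy-tailed demo the two printed columns are typically still noticeably apart at moderate $u$, which reflects the slow convergence in Theorem 2.1.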


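The condition $B_0 \in \mathcal{S}$ is for all practical purposes equivalent to $B \in \mathcal{S}$. However, mathematically one must note that there exist (quite intricate) examples where $B \in \mathcal{S}$, $B_0 \notin \mathcal{S}$ as well as examples where $B \notin \mathcal{S}$, $B_0 \in \mathcal{S}$. The tail of $B_0$ is easily expressed in terms of the tail of $B$ and the function $e(x)$ in Proposition 1.18,

$$\overline{B_0}(x) = \frac{1}{\mu_B}\int_x^\infty \overline{B}(y)\, dy = \frac{\overline{B}(x)EX^{(x)}}{\mu_B} = \frac{e(x)\overline{B}(x)}{\mu_B}. \qquad (2.1)$$

In particular, in our three main examples (regular variation, lognormal, Weibull) one has

$$\overline{B}(x) \sim \frac{L(x)}{x^\alpha} \ \Rightarrow\ \overline{B_0}(x) \sim \frac{L(x)}{\mu_B(\alpha - 1)x^{\alpha - 1}},$$

$$B \text{ lognormal}(\mu, \sigma^2) \ \Rightarrow\ \mu_B = e^{\mu + \sigma^2/2}, \quad \overline{B_0}(x) \sim \frac{\sigma^3 x\, e^{-(\log x - \mu)^2/2\sigma^2}}{e^{\mu + \sigma^2/2}(\log x)^2\sqrt{2\pi}},$$

$$\overline{B}(x) = e^{-x^\beta} \ \Rightarrow\ \mu_B = \frac{\Gamma(1/\beta)}{\beta}, \quad \overline{B_0}(x) \sim \frac{1}{\Gamma(1/\beta)}\, x^{1 - \beta}e^{-x^\beta}.$$

From this, $B_0 \in \mathcal{S}$ is immediate in the regularly varying case, and for the lognormal and Weibull cases it can be verified using Pitman's criterion (Proposition 1.13). In general it is known that $B \in \mathcal{S}^*$ is sufficient for $B_0 \in \mathcal{S}$.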

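Note that in these examples, $B_0$ is more heavy-tailed than $B$. In general:


Proposition 2.3 If $B \in \mathcal{S}$, then $\overline{B_0}(x)/\overline{B}(x) \to \infty$, $x \to \infty$.

Proof. Since $\overline{B}(x + y)/\overline{B}(x) \to 1$ uniformly in $y \in [0, a]$, we have

$$\liminf_{x \to \infty} \frac{\overline{B_0}(x)}{\overline{B}(x)} \ge \liminf_{x \to \infty} \frac{\int_x^{x + a} \overline{B}(y)\, dy}{\mu_B \overline{B}(x)} = \frac{a}{\mu_B}.$$

Let $a \to \infty$.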


Remark 2.4 Note that for regularly varying claim size distributions, one can also use the Pollaczeck-Khinchine formula and the Tauberian Theorem A6.2 (given in the Appendix) to provide a somewhat alternative proof of Theorem 2.1. More precisely, combining IV.(3.4) and IV.(3.5) we have

$$\widehat{\psi}[-s] = \frac{\rho(\widehat{B_0}[-s] - 1)}{s(\rho\widehat{B_0}[-s] - 1)} = \Bigl(\frac{\rho(1 - \widehat{B_0}[-s])}{s}\Bigr)\bigl(1 + \rho\widehat{B_0}[-s] + \rho^2\widehat{B_0}[-s]^2 + \cdots\bigr).$$

Assume that $\overline{B}(x) \sim L(x)/x^\alpha$ with $\alpha > 1$ and write $\alpha = n + \eta$ with $n = \lfloor\alpha\rfloor$ and $0 < \eta < 1$ (if $\eta = 0$ there are obvious amendments). Then from above

$$\overline{B_0}(x) \sim \frac{L(x)}{\mu_B(\alpha - 1)x^{\alpha - 1}}, \quad x \to \infty,$$


and with Theorem A6.2 this implies 


Bojs) = 14 BOM yO ozs 


for some constants a;. Hence 


v-3] = Po (RSE) tasts.) 


for some constants $b_i$ and after subtracting the resulting first $n - 2$ terms with powers $s^k$ ($k = 1, \ldots, n - 2$) the r.h.s. (and correspondingly also the l.h.s.) is regularly varying at $s = 0$ with index $\eta$. Since $\widehat{\psi}[-s]$ is the Laplace-Stieltjes transform of $\int_0^u \psi(y)\, dy$, another application of Theorem A6.2 shows that

$$\int_u^\infty \psi(y)\, dy \sim \frac{\rho}{1 - \rho}\int_u^\infty \overline{B_0}(y)\, dy$$

and by the Monotone Density Theorem

$$\psi(u) \sim \frac{\rho}{1 - \rho}\,\overline{B_0}(u).$$


Notes and references Theorem 2.1 was derived by several different authors and 
under varying assumptions. We mention here Teugels & Veraverbeke [842], von Bahr 
[122], Borovkov [182], Thorin & Wikstad [849], Pakes [677] and Embrechts & Veraver- 
beke [353]. 

The approximation in Theorem 2.1 is notoriously not very accurate. The problem
is a usually very slow rate of convergence as $u \to \infty$. For some earlier numerical
studies, see Abate, Choudhury & Whitt [1], Kalashnikov [517] and Asmussen & Bins-
wanger [72]. E.g., in [517, p.195] there are numerical examples where $\psi(u)$ is of order
$10^{-5}$ but Theorem 2.1 gives $10^{-10}$. This shows that although the approximation is
asymptotically correct in the tail, one may have to go out to values of $\psi(u)$ which are
unrealistically small before the fit is reasonable. Second order terms were e.g. intro- 
duced in Abate et al. [1] and Baltrūnas [126, 127], but unfortunately the improvement 
is not very pronounced, see also Omey & Willekens [675, 676] for some related work. 
Based upon ideas of Hogan [473], Asmussen & Binswanger [72] suggested an approxi- 
mation which is substantially better than Theorem 2.1 when u is small or moderately 
large. For US-Pareto and classical Pareto claim size distribution, the explicit Laplace 
transform in terms of an incomplete Gamma function can be used to obtain an integral 
representation (with non-oscillating integrand on the real line) for the ruin probability, 
which can be seen as an ‘almost explicit’ formula, see Ramsay [725, 726] and Albrecher 
& Kortschak [33]. 

In recent years, there has been a lot of research activity on higher-order asymptotic 
expansions of general compound distributions (under certain additional assumptions 
on the tail), see e.g. Geluk, Peng & de Vries [393], Borovkov & Borovkov [184], Barbe, 
McCormick & Zhang [131, 132], Mikosch & Nagaev [639], Kortschak & Albrecher 
[557] and Albrecher, Hipp & Kortschak [29]. In [29] it is also shown that a shift 
in the argument can substantially improve the accuracy of the first-order asymptotic 
approximation in Theorem 2.1, see also the Notes to Section XVI.2a. For higher-order 
approximations for absolute ruin probabilities, see Borovkov [183]. 

As any approximation valid as u — oo, the one in Theorem 2.1 can of course not 
be precise for small u. Olvera-Cravioto, Blanchet & Glynn [674] discuss the alternative 
of using the heavy traffic approximation for small and moderate u and identify the 
threshold where the subexponential approximation takes over. 

Upper bounds for $\psi(u)$ in the heavy-tailed case can be found in Kalashnikov [517,
518] (see also Willmot & Lin [891, Sec.6.2]), but are in general quite complicated.


3 The renewal model 


We consider the renewal model with claim size distribution $B$ and interarrival distribution $A$ as in Chapter VI. Let $U_i$ be the $i$th claim, $T_i$ the $i$th interarrival time and $X_i = U_i - T_i$,

$$S_n^{(d)} = X_1 + \cdots + X_n, \quad M = \sup_{n = 0, 1, \ldots} S_n^{(d)}, \quad \vartheta(u) = \inf\{n : S_n^{(d)} > u\}.$$

Then $\psi(u) = P(M > u) = P(\vartheta(u) < \infty)$. We assume positive safety loading, i.e. $\rho = \mu_B/\mu_A < 1$. The main result is:


Theorem 3.1 Assume that (a) the stationary excess distribution $B_0$ of $B$ is subexponential and that (b) $B$ itself satisfies $\overline{B}(x - y)/\overline{B}(x) \to 1$ uniformly on compact $y$-intervals. Then

$$\psi(u) \sim \frac{\rho}{1 - \rho}\,\overline{B_0}(u), \quad u \to \infty. \qquad (3.1)$$

[Note that (b) in particular holds if $B \in \mathcal{S}$.]

The proof is based upon the observation that also in the renewal setting, there is a representation of $M$ similar to the Pollaczeck-Khinchine formula. To this end, let $\vartheta_+ = \vartheta(0)$ be the first ascending ladder epoch of $\{S_n^{(d)}\}$,

$$G_+(A) = P(S^{(d)}_{\vartheta_+} \in A,\ \vartheta_+ < \infty) = P(S_{\tau_+} \in A,\ \tau_+ < \infty)$$

where $\tau_+ = T_1 + \cdots + T_{\vartheta_+}$ as usual denotes the first ascending ladder epoch of the continuous time claim surplus process $\{S_t\}$. Thus $G_+$ is the ascending ladder height distribution (which is defective because of $\mu_B < \mu_A$). Define further $\theta = \|G_+\| = P(\vartheta_+ < \infty)$. Then

$$M \stackrel{\mathcal{D}}{=} \sum_{i=1}^K Y_i \qquad (3.2)$$

where $K$ is geometric with parameter $\theta$, $P(K = k) = (1 - \theta)\theta^k$ and $Y_1, Y_2, \ldots$ are independent of $K$ and i.i.d. with distribution $G_+/\theta$ (the distribution of $S_{\tau_+}$ given $\tau_+ < \infty$). As for the compound Poisson model, this representation will be our basic vehicle to derive tail asymptotics of $M$ but we face the added difficulties that neither the constant $\theta$ nor the distribution of the $Y_i$ are explicit.

Let $F$ denote the distribution of the $X_i$ and $F_I$ the integrated tail, $\overline{F_I}(x) = \int_x^\infty \overline{F}(y)\, dy$, $x > 0$.


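Lemma 3.2 $\overline{F}(x) \sim \overline{B}(x)$, $x \to \infty$, and hence $\overline{F_I}(x) \sim \mu_B\overline{B_0}(x)$.


Proof. By dominated convergence and (b),

$$\frac{\overline{F}(x)}{\overline{B}(x)} = \int_0^\infty \frac{\overline{B}(x + y)}{\overline{B}(x)}\, A(dy) \to \int_0^\infty 1 \cdot A(dy) = 1.$$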


The lemma implies that (3.1) is equivalent to

$$P(M > u) \sim \frac{1}{|\mu_F|}\,\overline{F_I}(u), \quad u \to \infty, \qquad (3.3)$$

and we will prove it in this form (in XIII.2, we will use the fact that the proof of (3.1) holds for a general random walk satisfying the analogues of (a), (b) and does not rely on the structure $X_i = U_i - T_i$).

Write $\overline{G_+}(x) = G_+(x, \infty) = P(S^{(d)}_{\vartheta_+} > x,\ \vartheta_+ < \infty)$. Let further $\vartheta_- = \inf\{n > 0 : S_n^{(d)} < 0\}$ be the first descending ladder epoch, $G_-(A) = P(S^{(d)}_{\vartheta_-} \in A)$ the descending ladder height distribution ($\|G_-\| = 1$ because of $\mu_B < \mu_A$) and let $\mu_{G_-}$ be the mean of $G_-$.


Lemma 3.3 $\bar G_+(x) \sim \bar F_I(x)/|\mu_{G_-}|$, $x \to \infty$.


Proof. Let $R_+(A) = E\sum_{n=0}^{\vartheta_+ - 1} I\big(S_n^{(d)} \in A\big)$ denote the pre-$\vartheta_+$ occupation
measure and let $U_- = \sum_{n=0}^\infty G_-^{*n}$ be the renewal measure corresponding to $G_-$. Then

$$\bar G_+(x) = \int_{-\infty}^0 \bar F(x-y)\,R_+(dy) = \int_{-\infty}^0 \bar F(x-y)\,U_-(dy)$$

(the first identity is obvious and the second follows since an easy time reversion
argument shows that $R_+ = U_-$, cf. A.2). The heuristics is now that because
of (b), the contribution from the interval $(-N,0]$ to the integral is $O(\bar F(x))
= o(\bar F_I(x))$, whereas for large $|y|$, $U_-(dy)$ is close to Lebesgue measure on $(-\infty, 0]$
normalized by $|\mu_{G_-}|$ so that we should have

$$\bar G_+(x) \approx \frac{1}{|\mu_{G_-}|}\int_{-\infty}^0 \bar F(x-y)\,dy = \frac{1}{|\mu_{G_-}|}\,\bar F_I(x).$$


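We now make this precise. If $G_-$ is non-lattice, then by Blackwell's renewal
theorem $U_-(-n-1,-n] \to 1/|\mu_{G_-}|$. In the lattice case, we can assume that
the span is 1 and then the same conclusion holds since then $U_-(-n-1,-n]$ is
just the probability of a renewal at $n$.

Given $\epsilon$, choose $N$ such that $\bar F(n-1)/\bar F(n) \le 1+\epsilon$ for $n \ge N$ (this is
possible by (b) and Lemma 3.2), and that $U_-(-n-1,-n] \le (1+\epsilon)/|\mu_{G_-}|$ for
$n \ge N$. We then get

$$\begin{aligned}
\limsup_{x\to\infty} \frac{\bar G_+(x)}{\bar F_I(x)}
&\le \limsup_{x\to\infty} \int_{-N}^{0} \frac{\bar F(x-y)}{\bar F_I(x)}\,U_-(dy) + \limsup_{x\to\infty} \int_{-\infty}^{-N} \frac{\bar F(x-y)}{\bar F_I(x)}\,U_-(dy) \\
&\le \limsup_{x\to\infty} \frac{\bar F(x)}{\bar F_I(x)}\,U_-(-N,0] + \limsup_{x\to\infty} \frac{1}{\bar F_I(x)} \sum_{n=N}^{\infty} \bar F(x+n)\,U_-(-n-1,-n] \\
&\le 0 + \limsup_{x\to\infty} \frac{1}{\bar F_I(x)}\,\frac{1+\epsilon}{|\mu_{G_-}|} \sum_{n=N}^{\infty} \bar F(x+n) \\
&\le \frac{(1+\epsilon)^2}{|\mu_{G_-}|}\,\limsup_{x\to\infty} \frac{1}{\bar F_I(x)} \int_N^\infty \bar F(x+y)\,dy \\
&= \frac{(1+\epsilon)^2}{|\mu_{G_-}|}\,\limsup_{x\to\infty} \frac{\bar F_I(x+N)}{\bar F_I(x)} = \frac{(1+\epsilon)^2}{|\mu_{G_-}|}.
\end{aligned}$$

Here in the third step we used that (b) implies $\bar B(x)/\bar B_0(x) \to 0$ and hence
$\bar F(x)/\bar F_I(x) \to 0$, and in the last that $F_I$ is asymptotically proportional to
$B_0 \in \mathscr{S}$. Similarly,

$$\liminf_{x\to\infty} \frac{\bar G_+(x)}{\bar F_I(x)} \ge \frac{(1-\epsilon)^2}{|\mu_{G_-}|}.$$

Letting $\epsilon \downarrow 0$, the proof is complete.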


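Proof of Theorem 3.1. By Lemma 3.3, $P(Y_i > x) \sim \bar F_I(x)/(\theta|\mu_{G_-}|)$. Hence
using dominated convergence precisely as for the compound Poisson model, (3.2)
yields

$$P(M > u) \sim \sum_{k=1}^{\infty} (1-\theta)\theta^k\,k\,\frac{\bar F_I(u)}{\theta|\mu_{G_-}|} = \frac{\bar F_I(u)}{(1-\theta)|\mu_{G_-}|}.$$

Differentiating the Wiener-Hopf factorization identity (A.9)

$$1 - \hat F[s] = (1 - \hat G_-[s])\,(1 - \hat G_+[s])$$

and letting $s = 0$ yields

$$-\mu_F = -(1 - \hat G_-[0])\,\mu_{G_+} - (1 - \|G_+\|)\,\mu_{G_-} = -(1-\theta)\,\mu_{G_-}.$$

Therefore by Lemma 3.2,

$$\frac{\bar F_I(u)}{(1-\theta)|\mu_{G_-}|} \sim \frac{\mu_B \bar B_0(u)}{\mu_A - \mu_B} = \frac{\rho\,\bar B_0(u)}{1-\rho}.$$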
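
To get a feel for the size of the approximation (3.1), the following minimal sketch evaluates its right-hand side for a hypothetical Pareto claim size distribution with tail $(1+x)^{-\alpha}$, $\alpha > 1$, for which $\mu_B = 1/(\alpha-1)$ and $\bar B_0(x) = (1+x)^{-(\alpha-1)}$. The function names and the choice of claim distribution are assumptions made for illustration only; nothing beyond the formula in Theorem 3.1 is taken from the text.

```python
def pareto_excess_tail(x, alpha):
    """Tail of the stationary excess distribution B_0 when B has tail (1 + x)^(-alpha).

    B_0 has density B_tail(y) / mu_B with mu_B = 1 / (alpha - 1), which integrates
    to (1 + x)^(-(alpha - 1)); this requires alpha > 1.
    """
    return (1.0 + x) ** (-(alpha - 1.0))


def renewal_ruin_approx(u, mu_A, alpha):
    """Right-hand side of (3.1): rho / (1 - rho) * B0_tail(u), with rho = mu_B / mu_A."""
    mu_B = 1.0 / (alpha - 1.0)
    rho = mu_B / mu_A
    if not rho < 1.0:
        raise ValueError("positive safety loading requires rho < 1")
    return rho / (1.0 - rho) * pareto_excess_tail(u, alpha)


# Illustrative values: mean interarrival time 1, alpha = 3 (so mu_B = 0.5, rho = 0.5).
for u in (9.0, 99.0, 999.0):
    print(u, renewal_ruin_approx(u, mu_A=1.0, alpha=3.0))
```

Only $\mu_A$ enters from the interarrival distribution $A$, which reflects the statement of the theorem: to first order the heavy-tailed approximation does not see the shape of $A$.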


We conclude by a lemma needed in XIII.2:

Lemma 3.4 For any $a < \infty$, $P\big(M > u,\ S^{(d)}_{\vartheta(u)} - S^{(d)}_{\vartheta(u)-1} \le a\big) = o(\bar F_I(u))$.

Proof. Let $\omega(u) = \inf\{n : S_n^{(d)} \in (u-a,u),\ M_n < u\}$. Then

$$P(M \in (u-a,u)) \ge P(\omega(u) < \infty)\,(1 - \psi(0)).$$

On the other hand, on the set $\{M > u,\ S^{(d)}_{\vartheta(u)} - S^{(d)}_{\vartheta(u)-1} \le a\}$ we have $\omega(u) < \infty$,
and $\{S^{(d)}_{\omega(u)+n} - S^{(d)}_{\omega(u)}\}_{n=0,1,\ldots}$ must attain a maximum $> 0$ so that

$$P\big(M > u,\ S^{(d)}_{\vartheta(u)} - S^{(d)}_{\vartheta(u)-1} \le a\big) \le P(\omega(u) < \infty)\,\psi(0) \le \frac{\psi(0)}{1-\psi(0)}\,P(M \in (u-a,u)).$$

But since $P(M > u-a) \sim P(M > u)$, we have

$$P(M \in (u-a,u)) = o(P(M > u)) = o(\bar F_I(u)).$$


Notes and references Theorem 3.1 is due to Embrechts & Veraverbeke [353],
with roots in von Bahr [122], Pakes [677] and Teugels & Veraverbeke [842].
Asymptotic results for maxima of random walks with heavy-tailed increments again carry
over to corresponding statements for $\psi(u,T)$ in the renewal model, cf. for instance
Veraverbeke & Teugels [865] and more recently Baltrūnas [128]; see also Baltrūnas &
Klüppelberg [129]. Further results on tails of the discounted aggregate claims in this
model are given in Hao & Tang [447]. Wei & Yang [880] extend the integral
representation for the ruin probability for US Pareto claims to an Erlang renewal model.

A recent reference containing much relevant information on heavy-tailed
asymptotics for random walks is Borovkov & Borovkov [185].


4 Finite-horizon ruin probabilities 


We consider the compound Poisson model with $\rho = \beta\mu_B < 1$ and the stationary
excess distribution $B_0$ subexponential. Then $\psi(u) \sim \rho/(1-\rho)\,\bar B_0(u)$, cf.
Theorem 2.1. The asymptotic behavior of the finite-horizon ruin probability $\psi(u,T)$
for fixed $T$ is trivially given by

$$\psi(u,T) \sim \beta T\,\bar B(u), \tag{4.1}$$

since the coarse inequality $P(A_T - T > u) \le \psi(u,T) \le P(A_T > u)$ and
Proposition 1.5 in this case already suffice to see that $\psi(u,T) \sim P(A_T > u)$, from which
(4.1) follows by Lemma 2.2. It is therefore clear that one can expect deeper
results only for the case when the time horizon itself scales with $u$, and this will
be the subject of this section.

As usual, $\tau(u)$ is the time of ruin and as in V.7, we let $P^{(u)} = P(\cdot \mid \tau(u) < \infty)$.
The main result of this section, Theorem 4.4, states that under mild additional
conditions, there exist constants $e(u)$ such that the $P^{(u)}$-distribution of $\tau(u)/e(u)$
has a limit which is either Pareto (when $B$ is regularly varying) or exponential
(for $B$'s such as the lognormal or DFR Weibull); this should be compared with
the normal limit for the light-tailed case, cf. V.4. Combined with the
approximation for $\psi(u)$, this then easily yields approximations for the finite horizon
ruin probabilities $\psi(u, e(u)T)$ (Corollary 4.7).

We start by reviewing some general facts which are fundamental for the 
analysis. Essentially, the discussion provides an alternative point of view to 
some results in Chapter V, in particular Proposition V.2.3. 


4a Excursion theory for Markov processes 


Let until further notice $\{S_t\}$ be an arbitrary Markov process with state space
$E$ (we write $P_x$ when $S_0 = x$) and $m$ a stationary measure, i.e. $m$ is a ($\sigma$-finite)
measure on $E$ such that

$$\int_E m(dx)\,P_x(S_t \in A) = m(A) \tag{4.2}$$

for all measurable $A \subseteq E$ and all $t \ge 0$. Then there is a Markov process $\{R_t\}$
on $E$ such that

$$\int_E m(dx)\,h(x)\,E_x k(R_t) = \int_E m(dy)\,k(y)\,E_y h(S_t) \tag{4.3}$$

for all bounded measurable functions $h, k$ on $E$; in the terminology of general
Markov process theory, $\{S_t\}$ and $\{R_t\}$ are in classical duality w.r.t. $m$.

The simplest example is a discrete time discrete state space chain, where we
can take $h, k$ as indicator functions, for states $i, j$, say, and (4.3) with $t = 1$
means $m_i r_{ij} = m_j s_{ji}$ where $r_{ij}, s_{ij}$ are the transition probabilities for $\{R_t\}$,
resp. $\{S_t\}$. Thus, a familiar case is time reversion (here $m$ is the stationary
distribution); but the example of relevance for us is the following:


Proposition 4.1 A compound Poisson risk process $\{R_t\}$ and its associated
claim surplus process $\{S_t\}$ are in classical duality w.r.t. Lebesgue measure.


Proof. Starting from $R_0 = x$, $R_t$ is distributed as $x + t - \sum_{i=1}^{N_t} U_i$, and starting
from $S_0 = y$, $S_t$ is distributed as $y - t + \sum_{i=1}^{N_t} U_i$ (note that we allow $x, y$ to vary
in the whole of $\mathbb{R}$ and not as usual impose the restrictions $x \ge 0$, $y = 0$). Let $G$
denote the distribution of $\sum_{i=1}^{N_t} U_i - t$. Then (4.3) means

$$\iint h(x)\,k(x-z)\,dx\,G(dz) = \iint h(y+z)\,k(y)\,dy\,G(dz).$$

The equality of the l.h.s. to the r.h.s. follows by the substitution $y = x - z$.
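
In the discrete-chain reading of (4.3) with $t = 1$, the dual transition probabilities can be computed directly from one chain and a stationary measure. The sketch below does this for a hypothetical three-state chain; the names `p`, `q`, `m` and the example matrix are assumptions chosen for illustration, with `p` the transition matrix of the given chain and `q` that of its dual, so that `m[i] * q[i][j] == m[j] * p[j][i]`.

```python
def dual_transition_matrix(p, m):
    """Return q with m[i] * q[i][j] == m[j] * p[j][i] for all states i, j.

    p: transition matrix of a finite chain, given as a list of rows.
    m: a stationary measure of p, i.e. sum_j m[j] * p[j][i] == m[i] for all i.
    """
    n = len(m)
    return [[m[j] * p[j][i] / m[i] for j in range(n)] for i in range(n)]


# Hypothetical non-reversible example: a biased walk on a 3-cycle.
p = [[0.0, 0.9, 0.1],
     [0.1, 0.0, 0.9],
     [0.9, 0.1, 0.0]]
m = [1.0, 1.0, 1.0]  # counting measure; stationary because p is doubly stochastic

q = dual_transition_matrix(p, m)
row_sums = [sum(row) for row in q]  # stationarity of m makes every row sum to 1
```

Here the dual chain walks around the cycle in the opposite direction, which is the time reversion mentioned above; for a non-finite stationary measure such as Lebesgue measure in Proposition 4.1 the same relation holds, but $m$ can no longer be normalized to a distribution.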
For $F \subset E$, an excursion in $F$ starting from $x \in F$ is the (typically finite)
piece of sample path²

$$\{S_t\}_{0 \le t < \omega(F^c)} \mid S_0 = x \quad\text{where}\quad \omega(F^c) = \inf\{t > 0 : S_t \notin F\}.$$

We let $Q^S_x$ be the corresponding distribution and

$$Q^S_{x,y} = Q^S_x\big(\,\cdot \mid S_{\omega(F^c)-} = y,\ \omega(F^c) < \infty\big),$$

$y \in F$ (in discrete time, $S_{\omega(F^c)-}$ should be interpreted as $S_{\omega(F^c)-1}$). Thus, $Q^S_{x,y}$
is the distribution of an excursion of $\{S_t\}$ conditioned to start in $x \in F$ and
terminate in $y \in F$. $Q^R_x$ and $Q^R_{x,y}$ are defined similarly, and we let $\tilde Q^S_{x,y}$ refer to
the time reversed excursion. That is,

$$\tilde Q^S_{x,y}(\cdot) = P\big(\{S_{\omega(F^c)-t-}\}_{0 \le t < \omega(F^c)} \in \cdot \mid S_0 = x,\ S_{\omega(F^c)-} = y\big).$$

Theorem 4.2 $Q^R_{y,x} = \tilde Q^S_{x,y}$.

The theorem is illustrated in Fig. X.1 for the case $F = (-\infty, 0]$, $x = 0$.


[Figure X.1: panel (a) shows a sample path of $S_t$ up to $\omega(0,\infty) = \tau(0)$, panel (b) the time reversed path]

FIGURE X.1


The sample path in (a) is the excursion of $\{S_t\}$ conditioned to start in $x = 0$ and
to end in $y < 0$, the one in (b) is the time reversed path. The theorem states that
the path in (b) has the same distribution as an excursion of $\{R_t\}$ conditioned to
start in $y < 0$ and to end in $x = 0$. But in the risk theory example (corresponding
to which the sample paths are drawn), this simply means the distribution of the
path of $\{R_t\}$ starting from $y$ and stopped when 0 is hit. In particular:


Corollary 4.3 The distribution of $\tau(0)$ given $\tau(0) < \infty$, $S_{\tau(0)-} = y < 0$ is the
same as the distribution of $\omega(-y)$ where $\omega(z) = \inf\{t > 0 : R_t = z\}$, $z > 0$.


[Note that $\omega(z) < \infty$ a.s. when $\rho = \beta\mu_B < 1$.]


²In general Markov process theory, a main difficulty is to make sense of such excursions
also when $P_x(\omega(F^c) = 0) = 1$. Say $\{S_t\}$ is reflected Brownian motion on $[0,\infty)$, $x = 0+$
and $F = (0,\infty)$. For the present purposes it suffices, however, to consider only the case
$P_x(\omega(F^c) = 0) = 0$.


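Proof of Theorem 4.2. We consider the discrete time discrete state space case
only (well-behaved cases such as the risk process example can then easily be
handled by discrete approximations). We can then view $Q^S_{x,y}$ as a measure on
all strings of the form $i_0 i_1 \ldots i_n$ with $i_0, i_1, \ldots, i_n \in F$, $i_0 = x$, $i_n = y$,

$$Q^S_{x,y}(i_0 i_1 \ldots i_n) = \frac{P_x(S_1 = i_1, \ldots, S_n = i_n = y,\ S_{n+1} \in F^c)}{P_x(\omega(F^c) < \infty,\ S_{\omega(F^c)-1} = y)}\,;$$

note that

$$P_x(\omega(F^c) < \infty,\ S_{\omega(F^c)-1} = y) = \sum_{n=1}^{\infty} \sum_{i_1,\ldots,i_{n-1} \in F} P_x(S_1 = i_1, \ldots, S_n = i_n = y,\ S_{n+1} \in F^c).$$

Similarly, $\tilde Q^S_{x,y}$ and $Q^R_{y,x}$ are measures on all strings of the form $i_0 i_1 \ldots i_n$ with
$i_0, i_1, \ldots, i_n \in F$, $i_0 = y$, $i_n = x$,

$$Q^R_{y,x}(i_0 i_1 \ldots i_n) = \frac{P_y(R_1 = i_1, \ldots, R_n = i_n = x,\ R_{n+1} \in F^c)}{P_y(\omega(F^c) < \infty,\ R_{\omega(F^c)-1} = x)}$$

and $\tilde Q^S_{x,y}(i_0 i_1 \ldots i_n) = Q^S_{x,y}(i_n i_{n-1} \ldots i_0)$.

To show $Q^R_{y,x}(i_0 i_1 \ldots i_n) = Q^S_{x,y}(i_n i_{n-1} \ldots i_0)$ when $i_0, i_1, \ldots, i_n \in F$, $i_0 = y$,
$i_n = x$, note first that

$$\begin{aligned}
P_y(R_1 = i_1, \ldots, R_n = i_n = x,\ R_{n+1} \in F^c)
&= r_{i_0 i_1} r_{i_1 i_2} \cdots r_{i_{n-1} i_n} \sum_{j \in F^c} r_{xj} \\
&= \frac{m_{i_1} s_{i_1 i_0}}{m_{i_0}}\,\frac{m_{i_2} s_{i_2 i_1}}{m_{i_1}} \cdots \frac{m_{i_n} s_{i_n i_{n-1}}}{m_{i_{n-1}}} \sum_{j \in F^c} \frac{m_j s_{jx}}{m_x} \\
&= \frac{1}{m_y}\,s_{i_n i_{n-1}} \cdots s_{i_1 i_0} \sum_{j \in F^c} m_j s_{jx}.
\end{aligned}$$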
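Thus

$$Q^R_{y,x}(i_0 i_1 \ldots i_n) = \frac{s_{i_n i_{n-1}} \cdots s_{i_1 i_0} \sum_{j \in F^c} m_j s_{jx}}{\sum_{k=1}^{\infty} \sum_{i_1,\ldots,i_{k-1} \in F} s_{x i_{k-1}} \cdots s_{i_1 y} \sum_{j \in F^c} m_j s_{jx}} = \frac{s_{x i_{n-1}} \cdots s_{i_1 y}}{\sum_{k=1}^{\infty} \sum_{i_1,\ldots,i_{k-1} \in F} s_{x i_{k-1}} \cdots s_{i_1 y}}.$$

Similarly but easier

$$Q^S_{x,y}(i_n i_{n-1} \ldots i_0) = \frac{s_{x i_{n-1}} \cdots s_{i_1 y} \sum_{j \in F^c} s_{yj}}{\sum_{k=1}^{\infty} \sum_{i_1,\ldots,i_{k-1} \in F} s_{x i_{k-1}} \cdots s_{i_1 y} \sum_{j \in F^c} s_{yj}} = \frac{s_{x i_{n-1}} \cdots s_{i_1 y}}{\sum_{k=1}^{\infty} \sum_{i_1,\ldots,i_{k-1} \in F} s_{x i_{k-1}} \cdots s_{i_1 y}}.$$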


4b The time to ruin 


Our approach to the study of the asymptotic distribution of the ruin time is to
decompose the path of $\{S_t\}$ in ladder segments. To clarify the ideas we first
consider the case where ruin occurs already in the first ladder segment, that is,
the case $\tau(0) < \infty$, $S_{\tau(0)} > u$.

Let $Y = Y_1 = S_{\tau_+(1)}$ be the value of the claim surplus process just after the
first ladder epoch, $Z = Z_1 = -S_{\tau_+(1)-}$ the value just before the first ladder epoch
(these r.v.'s are defined w.p. 1 w.r.t. $P^{(0)}$), see Fig. X.2.


FIGURE X.2


The distribution of $(Y, Z)$ is described in Theorem IV.2.2. The formulation
relevant for the present purposes states that $Y$ has distribution $B_0$ and that
conditionally upon $Y = y$, $Z$ follows the excess distribution $B^{(y)}$ given by
$\bar B^{(y)}(x) = \bar B(y + x)/\bar B(y)$.
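We are interested in the conditional distribution of $\tau(u) = \tau(0)$ given

$$\{\tau(0) < \infty,\ S_{\tau(0)} > u\} = \{\tau(0) < \infty,\ Y > u\},$$

that is, the distribution w.r.t. $P^{(u,1)} = P(\cdot \mid \tau(0) < \infty,\ Y > u)$. Now the
$P^{(u,1)}$-distribution of $Y - u$ is $B_0^{(u)}$. That is, the $P^{(u,1)}$-density of $Y$ is
$\bar B(y)/[\mu_B \bar B_0(u)]$, $y > u$. $B_0^{(u)}$ is also the $P^{(u,1)}$-distribution of $Z$ since

$$P(Z > a \mid Y > u) = \int_u^\infty \frac{\bar B(y)}{\mu_B \bar B_0(u)}\,\frac{\bar B(y+a)}{\bar B(y)}\,dy = \int_{u+a}^\infty \frac{\bar B(y)}{\mu_B \bar B_0(u)}\,dy = \bar B_0^{(u)}(a).$$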


Let $\{\omega(z)\}_{z \ge 0}$ be defined by $\omega(z) = \inf\{t > 0 : R_t = z\}$ where $\{R_t\}$ is
independent of $\{S_t\}$, in particular of $Z$. Then Corollary 4.3 implies that the
$P^{(u,1)}$-distribution of $\tau(u) = \tau(0)$ is that of $\omega(Z)$. Now $B_0 \in \mathscr{S}$ implies that
$B_0^{(u)}(a) \to 0$ for any fixed $a$, i.e. $P(Z \le a \mid Y > u) \to 0$. Since $\omega(z)/z
\to 1/(1-\rho)$ a.s., $z \to \infty$, it therefore follows that $\tau(u)/Z$ converges in
$P^{(u,1)}$-probability to $1/(1-\rho)$.

Since the conditional distribution of $Z$ is known (viz. $B_0^{(u)}$), this in principle
determines the asymptotic behavior of $\tau(u)$. However, a slight rewriting may be
more appealing. Recall the definition of the auxiliary function $e(x)$ in Section 1.
It is straightforward that under the conditions of Proposition 1.18(c)

$$\bar B_0^{(u)}(y\,e(u)) \to P(W > y) \tag{4.4}$$

where the distribution of $W$ is Pareto with mean one in case (a) and exponential
with mean one in case (b). That is, $Z/e(u) \to W$ in $P^{(u,1)}$-distribution.
$\tau(u)/Z \to 1/(1-\rho)$ then yields the final result $\tau(u)/e(u) \to W/(1-\rho)$ in
$P^{(u,1)}$-distribution.

We now turn to the general case and will see that this conclusion is also true
in $P^{(u)}$-distribution:


Theorem 4.4 Assume that $B_0 \in \mathscr{S}$ and that (4.4) holds. Then $\tau(u)/e(u)
\to W/(1-\rho)$ in $P^{(u)}$-distribution.


In the proof, let $\tau_+(1) = \tau(0), \tau_+(2), \ldots$ denote the ladder epochs and let
$Y_k, Z_k$ be defined similarly as $Y = Y_1$, $Z = Z_1$ but relative to the $k$th ladder
segment, cf. Fig. X.3. Then, conditionally upon $\tau_+(n) < \infty$, the random vectors
$(Y_1, Z_1), \ldots, (Y_n, Z_n)$ are i.i.d. and distributed as $(Y, Z)$.

We let $K(u) = \inf\{n = 1, 2, \ldots : \tau_+(n) < \infty,\ Y_1 + \cdots + Y_n > u\}$ denote the
number of ladder steps leading to ruin and $P^{(u,n)} = P(\cdot \mid \tau(u) < \infty,\ K(u) = n)$.
The idea is now to observe that if $K(u) = n$, then by the subexponential property
$Y_n$ must be large, i.e. $> u$ with high probability, and $Y_1, \ldots, Y_{n-1}$ 'typical'.
Hence $Z_n$ must be large and $Z_1, \ldots, Z_{n-1}$ 'typical', which implies that the first
$n-1$ ladder segments must be short and the last long; more precisely, the duration
$\tau_+(n) - \tau_+(n-1)$ of the last ladder segment can be estimated by the same
approach as we used above when $n = 1$, and since it dominates the first $n-1$,
we get the same asymptotics as when $n = 1$.


FIGURE X.3


In the following, $\|\cdot\|$ denotes the total variation norm between probability
measures and $\otimes$ product measure.


Lemma 4.5 $\big\| P^{(u,n)}\big((Y_1, \ldots, Y_{n-1}, Y_n - u) \in \cdot\big) - B_0^{\otimes(n-1)} \otimes B_0^{(u)} \big\| \to 0$.


Proof. We shall use the easily proved fact that if $A'(u), A''(u)$ are events such
that $P(A'(u)\,\Delta\,A''(u)) = o(P(A'(u)))$ ($\Delta$ = symmetrical difference of events),
then

$$\| P(\cdot \mid A'(u)) - P(\cdot \mid A''(u)) \| \to 0.$$

Taking $A'(u) = \{Y_n > u\}$,

$$A''(u) = \{K(u) = n\} = \{Y_1 + \cdots + Y_{n-1} \le u,\ Y_1 + \cdots + Y_n > u\},$$

the condition on $A'(u)\,\Delta\,A''(u)$ follows from $B_0$ being subexponential
(Proposition 1.2, suitably adapted). Further, $P(\cdot \mid A''(u)) = P^{(u,n)}$,

$$P\big((Y_1, \ldots, Y_{n-1}, Y_n - u) \in \cdot \mid A'(u)\big) = B_0^{\otimes(n-1)} \otimes B_0^{(u)}.$$


Lemma 4.6 $\big\| P^{(u,n)}\big((Z_1, \ldots, Z_n) \in \cdot\big) - B_0^{\otimes(n-1)} \otimes B_0^{(u)} \big\| \to 0$.


Proof. Let $(Y_1', Z_1'), \ldots, (Y_n', Z_n')$ be independent random vectors such that the
conditional distribution of $Z_k'$ given $Y_k' = y$ is $B^{(y)}$, $k = 1, \ldots, n$, and that $Y_k'$ has
marginal distribution $B_0$ for $k = 1, \ldots, n-1$ and $Y_n' - u$ has distribution $B_0^{(u)}$.
That is, the density of $Y_n'$ is $\bar B(y)/[\mu_B \bar B_0(u)]$, $y > u$. The same calculation as
given above when $n = 1$ shows then that the marginal distribution of $Z_n'$ is $B_0^{(u)}$.
Similarly (replace $u$ by 0), the marginal distribution of $Z_k'$ is $B_0$ for $k < n$, and
clearly $Z_1', \ldots, Z_n'$ are independent. Now use that if the conditional distribution
of $Z'$ given $Y'$ is the same as the conditional distribution of $Z$ given $Y$ and
$\|P(Y \in \cdot) - P(Y' \in \cdot)\| \to 0$, then $\|P(Z \in \cdot) - P(Z' \in \cdot)\| \to 0$ (here $Y, Y', Z, Z'$
are arbitrary random vectors, in our example $Y = (Y_1, \ldots, Y_n)$ etc.).


Proof of Theorem 4.4. The first step is to observe that $K(u)$ has a proper limit
distribution w.r.t. $P^{(u)}$ since by Theorem 2.1,

$$\begin{aligned}
P^{(u)}(K(u) = n) &= \frac{1}{\psi(u)}\,\rho^n\,P(Y_1 + \cdots + Y_{n-1} \le u,\ Y_1 + \cdots + Y_n > u) \\
&\sim \frac{1}{\rho/(1-\rho)\,\bar B_0(u)}\,\rho^n\,P(Y_n > u) = (1-\rho)\rho^{n-1}
\end{aligned}$$

for $n = 1, 2, \ldots$. It therefore suffices to show that the $P^{(u,n)}$-distribution of $\tau(u)/e(u)$
has the asserted limit. Let $\{\omega_1(z)\}, \ldots, \{\omega_n(z)\}$ be i.i.d. copies of $\{\omega(z)\}$. Then
according to Section 4a, the $P^{(u,n)}$-distribution of $\tau(u)$ is the same as the
$P^{(u,n)}$-distribution of $\omega_1(Z_1) + \cdots + \omega_n(Z_n)$. By Lemma 4.6, $\omega_k(Z_k)$ has a proper limit
distribution as $u \to \infty$ for $k < n$, whereas $\omega_n(Z_n)$ has the same limit behavior
as when $n = 1$ (cf. the discussion just before the statement of Theorem 4.4).
Thus

$$\begin{aligned}
P^{(u,n)}(\tau(u)/e(u) > y) &= P^{(u,n)}\big([\omega_1(Z_1) + \cdots + \omega_n(Z_n)]/e(u) > y\big) \\
&\sim P^{(u,n)}\big(\omega_n(Z_n)/e(u) > y\big) \to P\big(W/(1-\rho) > y\big).
\end{aligned}$$

Corollary 4.7 $\psi(u, e(u)T) \sim \dfrac{\rho}{1-\rho}\,\bar B_0(u) \cdot P\big(W/(1-\rho) \le T\big)$.


For a growth rate $d(u)T$ of the time horizon with $d(u) = o(e(u))$,
Corollary 4.7 implies $\psi(u, d(u)T) = o(\psi(u))$. The following theorem gives some
more explicit information for this case and identifies the bridge between the
asymptotic behavior of $\psi(u,T)$ of the order of $\bar B(u)$ (cf. (4.1)) and the one of
$\psi(u, e(u)T)$ of the order of $\bar B_0(u)$ (which is already the order of $\psi(u)$).


Theorem 4.8 If $B \in \mathscr{S}$ and $B_0 \in \mathscr{S}$, then for $d(u) \uparrow \infty$ with $d(u) = o(e(u))$,

$$\psi(u, d(u)T) \sim \beta\,\bar B(u)\,d(u)\,T.$$


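Proof. Consider the random walk $A_n = \sum_{i=1}^n \xi_i$ with $E(\xi_i) = 0$ and distribution
function $F(x) = P(\xi_i \le x) \in \mathscr{S}$ and also its stationary excess distribution
$F_0 \in \mathscr{S}$. Let $M^c_\sigma := \max_{n \le \sigma}(A_n - cn)$ for some stopping time $\sigma$ of the random
walk and some constant $c > 0$. According to a result of Foss, Palmowski &
Zachary [366], one then has

$$P(M^c_\sigma > u) \sim \sum_{n \ge 1} P(\sigma \ge n)\,\bar F(u + cn) \tag{4.5}$$

as $u \to \infty$, uniformly over all stopping times $\sigma$. Let $h$ be fixed and choose the $\xi_i$
distributed as $\sum_{j=1}^{N_h} U_j - \beta\mu_B h$ (implying $F \in \mathscr{S}$ and $F_0 \in \mathscr{S}$), $c = (1-\beta\mu_B)h$
and furthermore $\sigma = d(u)T/h$.

Denote the ruin probability of the discrete-time process $R_{nh}$ ($n \in \mathbb{N}$) (i.e. the
Cramér-Lundberg process viewed at time points $nh$ only) by $\psi^{(h)}$. Relation
(4.5) then translates into

$$\psi^{(h)}(u, d(u)T) \sim \sum_{n \ge 1} P(d(u)T \ge nh)\,\bar F\big(u + (1-\beta\mu_B)nh\big) = \sum_{1 \le n \le d(u)T/h} \bar F\big(u + (1-\beta\mu_B)nh\big).$$

It follows that an asymptotic upper bound for $\psi^{(h)}(u, d(u)T)$ is

$$\frac{d(u)T}{h}\,\bar F\big(u + (1-\beta\mu_B)h\big) = \frac{d(u)T}{h}\,P\Big(\sum_{j=1}^{N_h} U_j > u + h\Big)$$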
and similarly an asymptotic lower bound is

$$\begin{aligned}
\frac{d(u)T}{h}\,\bar F\big(u + (1-\beta\mu_B)d(u)T\big)
&= \frac{d(u)T}{h}\,P\Big(\sum_{j=1}^{N_h} U_j > u + (1-\beta\mu_B)d(u)T + \beta\mu_B h\Big) \\
&\sim \beta\,d(u)T\,\bar B\big(u + (1-\beta\mu_B)d(u)T + \beta\mu_B h\big) \\
&\sim \beta\,d(u)T\,\bar B(u),
\end{aligned}$$

where the last asymptotic relation uses $\bar B(u + d(u)) \sim \bar B(u)$, which holds for
$d(u) = o(e(u))$ under the stated assumptions on $B$ (this property in fact
characterizes the special role of the mean excess function $e(u)$ in this context). Thus
$\psi^{(h)}(u, d(u)T) \sim \beta\,d(u)T\,\bar B(u)$. From

$$\sup_{0 \le t \le T}(A_t - t) \ \ge\ \sup_{0 \le nh \le T}(A_{nh} - nh) \ \ge\ \sup_{0 \le t \le T}(A_t - t) - h$$

one finally observes that for $h \to 0$, $\psi^{(h)}(u, d(u)T)$ can be replaced by $\psi(u, d(u)T)$.
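
The two regimes (4.1) and Theorem 4.8 differ only by the factor $d(u)$, which the following minimal sketch makes concrete. It assumes, purely for illustration, Pareto claims with tail $(1+x)^{-\alpha}$, for which the mean excess function is $(1+u)/(\alpha-1)$, and the hypothetical growth rate $d(u) = \sqrt{1+u}$, which tends to infinity but is of smaller order than the mean excess function; the function names are likewise assumptions.

```python
import math


def pareto_tail(x, alpha):
    """Hypothetical claim size tail: P(U > x) = (1 + x)^(-alpha)."""
    return (1.0 + x) ** (-alpha)


def fixed_horizon_approx(u, T, beta, alpha):
    """Right-hand side of (4.1): beta * T * B_tail(u), for a fixed horizon T."""
    return beta * T * pareto_tail(u, alpha)


def growing_horizon_approx(u, T, beta, alpha, d):
    """Right-hand side of Theorem 4.8: beta * B_tail(u) * d(u) * T."""
    return beta * pareto_tail(u, alpha) * d(u) * T


# Illustrative values: beta = 0.5 and alpha = 2 give mu_B = 1 and rho = 0.5 < 1.
u = 99.0
print(fixed_horizon_approx(u, 2.0, 0.5, 2.0))
print(growing_horizon_approx(u, 2.0, 0.5, 2.0, lambda v: math.sqrt(1.0 + v)))
```

Both expressions are asymptotic statements as $u \to \infty$; the sketch only evaluates the leading terms and says nothing about the error at a given $u$.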


Notes and references Excursion theory for general Markov processes is a fairly
abstract and advanced topic. For Theorem 4.2, see Fitzsimmons [363], in particular
his Proposition 2.1.

Most of the results of Section 4b are from Asmussen & Klüppelberg [86] who also
treated the renewal model and gave a sharp total variation limit result. Extensions to
the Markov-modulated model of Chapter VII are in Asmussen & Højgaard [80] and to
Lévy processes in Klüppelberg, Kyprianou & Maller [543]. Theorem 4.8 can be found
in Albrecher & Asmussen [12].

For extensions of (4.1) that hold uniformly for $t$ in renewal models, see Tang [828]
and Leipus & Šiaulys [578].

Asmussen & Teugels [107] studied approximations of $\psi(u,T)$ when $T \to \infty$ with $u$
fixed; the results only cover the regularly varying case.


5 Reserve-dependent premiums 


We consider the model of Chapter VIII with Poisson arrivals at rate $\beta$, claim
size distribution $B$, and premium rate $p(x)$ at level $x$ of the reserve.


Theorem 5.1 Assume that $B$ is subexponential and that $p(x) \to \infty$, $x \to \infty$.
Then

$$\psi(u) \sim \beta \int_u^\infty \frac{\bar B(y)}{p(y)}\,dy. \tag{5.1}$$

The key step in the proof is the following lemma on the cycle maximum of
the associated storage process $\{V_t\}$, cf. Corollary III.2.2. Assume for simplicity
that $\{V_t\}$ regenerates in state 0, i.e. that $\int_0^\epsilon p(x)^{-1}\,dx < \infty$, and define the cycle
as

$$\sigma = \inf\Big\{t > 0 : V_t = 0,\ \max_{0 \le s \le t} V_s > 0 \,\Big|\, V_0 = 0\Big\}.$$

Lemma 5.2 Define $M_\sigma = \sup_{0 \le t \le \sigma} V_t$. Then $P(M_\sigma > u) \sim \beta\,E\sigma \cdot \bar B(u)$.


The heuristic motivation is the usual one in the heavy-tailed area, that $M_\sigma$ becomes
large as a consequence of one big jump. The form of the result then follows by
noting that the process has mean time $E\sigma$ to make this big jump and that it
then occurs with intensity $\beta\bar B(u)$. More precisely, one expects the level $y$ from
which the big jump occurs to be $O(1)$; the probability that this exceeds $u$ is
then $\bar B(u-y) \sim \bar B(u)$. The rigorous proof is, however, non-trivial and we refer
to Asmussen [64] (with a gap of that paper being filled in Asmussen et al. [83]).


Proof of Theorem 5.1. We will show that the stationary density $f(x)$ of $\{V_t\}$
satisfies

$$f(x) \sim \frac{\beta\bar B(x)}{p(x)}. \tag{5.2}$$

We then get

$$\psi(u) = P(V > u) = \int_u^\infty f(y)\,dy \sim \beta \int_u^\infty \frac{\bar B(y)}{p(y)}\,dy$$

and the result follows.

Define $D(u)$ as the steady-state rate of downcrossings of $\{V_t\}$ of level $u$ and
$D_\sigma(u)$ as the expected number of downcrossings of level $u$ during a cycle. Then
$D(u) = f(u)p(u)$ and, by regenerative process theory, $D(u) = D_\sigma(u)/\mu$ with
$\mu = E\sigma$ the mean cycle length. Further
the conditional distribution of the number of downcrossings of $u$ during a cycle
given $M_\sigma > u$ is geometric with parameter $q(u) = P(M_\sigma > u \mid V_0 = u)$. Hence

$$f(u)p(u) = D(u) = \frac{D_\sigma(u)}{\mu} = \frac{P(M_\sigma > u)}{\mu\,(1 - q(u))} \sim \frac{\beta\bar B(u)}{1 - q(u)}.$$

Now just use that $p(x) \to \infty$ implies $q(x) \to 0$.
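
The integral in Theorem 5.1 is rarely available in closed form, but it is easy to evaluate numerically. The sketch below uses a midpoint rule after mapping $[u, \infty)$ to $(0,1)$; the claim tail $(1+y)^{-2}$ and the linear premium rule $p(y) = 1 + y$ are hypothetical choices made only because they give the closed form $\beta(1+u)^{-2}/2$ to compare against, and the function names are assumptions as well.

```python
def reserve_dependent_ruin_approx(u, beta, b_tail, p, steps=20_000):
    """Approximate beta * int_u^inf b_tail(y) / p(y) dy by the midpoint rule.

    The substitution y = u + t / (1 - t), t in (0, 1), with dy = dt / (1 - t)^2,
    turns the infinite range into a finite one.
    """
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        y = u + t / (1.0 - t)
        total += b_tail(y) / p(y) / (1.0 - t) ** 2
    return beta * total * h


def pareto_tail(y):
    """Hypothetical claim size tail (1 + y)^(-2)."""
    return (1.0 + y) ** -2.0


def linear_premium(y):
    """Hypothetical premium rule p(y) = 1 + y, which tends to infinity as required."""
    return 1.0 + y


approx = reserve_dependent_ruin_approx(4.0, 1.0, pareto_tail, linear_premium)
exact = (1.0 + 4.0) ** -2.0 / 2.0  # closed form of the integral for these choices
print(approx, exact)
```

The change of variable is a convenience of this sketch; any quadrature routine that handles the infinite range would serve equally well.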


Notes and references The results are from Asmussen [64], where also the (easier)
case of $p(x)$ having a finite limit is treated. It is also shown in that paper that
typically, there exist constants $c(u) \to 0$ such that the limiting distribution of $\tau(u)/c(u)$,
given $\tau(u) < \infty$, is exponential. An early reference for linear $p(x)$ is Klüppelberg &
Stadtmüller [546]. Note that for linear $p(x)$ and regularly varying claim size
distribution, the result is consistent with the limit $\sigma \to 0$ of X.(6.2). For extensions and
variants see Foss, Palmowski & Zachary [366] and Robert [741]. There are a number of
further papers of Tang and co-authors dealing with aspects of ruin with
subexponential claims under interest force, e.g. [551, 829]. See also Kalashnikov & Konstantinides
[519].


6 Tail estimation 


The fact that the order of ruin probabilities usually depends crucially on the tail
and that the asymptotics are very different in light- and heavy-tailed regimes
poses the problem of which distributional tail $\bar F$ to employ. Of course, this is
a general statistical problem but definitely something that needs to be taken
seriously. We give here only a brief introduction and refer in the Notes to
standard textbooks for more detailed and broader expositions.

When computing ruin probabilities, assumptions on the tail alone only partially
suffice: more precise estimates like the Cramér-Lundberg approximation with
light tails or the subexponential approximation require the whole distribution,
but we will not discuss here how to combine tail fitting with fitting in the whole
support.

We will consider the problem of fitting (with particular emphasis on the
tail) a distribution $F$ to a set of data $X_1, \ldots, X_n > 0$ assumed to be i.i.d. with
common distribution $F$. As usual, $X_{(1)}, \ldots, X_{(n)}$ denote the order statistics.

Inference on $\bar F(x)$ beyond $x = X_{(n)}$ is of course extrapolation of the data,
and in a given situation, it will far from always be obvious that this makes
sense. However, some extrapolation seems inevitable: because the empirical
distribution has the finite upper bound $X_{(n)}$, most methods are likely to
underestimate the tail and in particular often postulate that it is light.


6a The mean excess plot 


A first question is to decide whether to use a light- or a heavy-tailed model.
The approach most widely used is based on the mean excess function

$$e(x) = E[X - x \mid X > x] = \frac{1}{\bar F(x)} \int_x^\infty \bar F(y)\,dy$$

introduced in Section 1.

The reason that the mean excess function $e(x)$ is useful is that it typically
asymptotically behaves quite differently for light and heavy tails. Namely, for a
subexponential heavy-tailed distribution one has $e(x) \to \infty$, whereas with light
tails it will typically hold that $\limsup e(x) < \infty$; say a sufficient condition is

$$\bar F(x) \sim \ell(x)\,e^{-\alpha x} \tag{6.1}$$

for some $\alpha > 0$ and some $\ell(x)$ such that $\ell(\log x)$ is slowly varying (e.g., $\ell(x) = x^\gamma$
with $-\infty < \gamma < \infty$).

The mean excess test proceeds by plotting the empirical version

$$e_n(x) = \frac{1}{\#\{j : X_j > x\}} \sum_{j : X_j > x} (X_j - x)$$

of $e(x)$, usually only at the (say) $K$ largest $X_j$. That is, the plot consists of the
pairs formed by $X_{(n-k)}$ and

$$\frac{1}{k} \sum_{\ell = n-k+1}^{n} \big(X_{(\ell)} - X_{(n-k)}\big),$$

where $k = 1, \ldots, K$. If the plot shows a clear increase to $\infty$ except possibly at
very small $k$, one takes this as an indication that $F$ is heavy-tailed, otherwise
one settles for a light-tailed model.
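
The points of the mean excess plot are straightforward to compute from the order statistics. The following sketch returns the pairs described above; the function name is an assumption, and ties among the observations are ignored (as is natural for data from a continuous distribution).

```python
def mean_excess_points(data, K):
    """Return the pairs (X_(n-k), empirical mean excess over X_(n-k)) for k = 1..K."""
    xs = sorted(data)
    n = len(xs)
    if not 1 <= K < n:
        raise ValueError("need 1 <= K < n")
    points = []
    for k in range(1, K + 1):
        threshold = xs[n - k - 1]  # X_(n-k) in 1-based order-statistic notation
        excesses = [x - threshold for x in xs[n - k:]]  # X_(n-k+1), ..., X_(n)
        points.append((threshold, sum(excesses) / k))
    return points


# Tiny hand-checkable example: thresholds 4 and 2, both with mean excess 4.
print(mean_excess_points([8, 1, 4, 2], 2))
```

Plotting the second coordinate against the first (or against $k$) with any plotting tool then gives the mean excess plot.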


Example 6.1 Figure X.4 contains the mean excesses of simulated data with
$n = 1{,}000$ from six different distributions. Each row is generated from i.i.d.
r.v.'s $Y_1, Y_2, \ldots$ such that $X = Y_1$ in the left column and $X = Y_1 + Y_2 + Y_3$
in the right. In row 1, $Y$ is Pareto with $\alpha = 3/2$; in row 2, $Y$ is Weibull with
$\beta = 1/2$; and in row 3, $Y$ is exponential; the scale is chosen such that $EY = 3$
in all cases.

To our mind, the story told by these pictures is not all that clear, so one
conclusion is certainly that using the mean excess plot is not entirely
straightforward.


6b Extreme values and POT 


Another method for making qualitative statements about the tail of the
underlying distribution $F$ of the data is based on extreme value theory, more
precisely the results from that area that describe the asymptotics of the maximum
$M_n = X_{(n)}$ and large order statistics like $X_{(n-1)}, \ldots, X_{(n-k)}$.

To describe the method, we first recall the Fisher-Tippett theorem that
states that when $M_n$ can be scaled and centered such that $(M_n - d_n)/c_n$ has
a non-degenerate limit in distribution $H$ for suitable constants $c_n, d_n$, then $H$
must be of one of three types, the three classical extreme value distributions
Fréchet, Weibull or Gumbel. Here a Fréchet limit typically occurs for very
heavy-tailed distributions (in fact, if and only if $\bar F$ is regularly varying), whereas
a Gumbel limit occurs for light-tailed distributions and moderately heavy-tailed
distributions like the lognormal and Weibull tails $e^{-ax^b}$ with $b < 1$; Weibull
limits³ need not concern us here since they only occur for distributions with a
bounded support.

The Fréchet c.d.f. is $H(x) = e^{-x^{-\alpha}}$, $x > 0$ and $\alpha > 0$, and the Gumbel
c.d.f. is $H(x) = e^{-e^{-x}}$, $x \in \mathbb{R}$. The qualifier 'type' above refers to the fact
that obviously $H$ can only be given up to scaling and location constants. It is
customary to work in the class of generalized extreme value distributions defined
by

$$H_\xi(x) = \begin{cases} \exp\{-(1+\xi x)^{-1/\xi}\} & \xi \ne 0,\ 1 + \xi x > 0, \\ \exp\{-e^{-x}\} & \xi = 0,\ -\infty < x < \infty. \end{cases}$$

The particular reason for the normalization of the Fréchet c.d.f. ($\xi > 0$) is to
ensure continuity at $\xi = 0$, i.e. $H_\xi(x) \to H_0(x)$ for all $x$ as $\xi \to 0$. The class
of all possible limits is obtained by adding a location parameter $\mu$ and a scale
parameter $\sigma > 0$, i.e. by considering the $H_{\xi,\mu,\sigma}(x) = H_\xi((x-\mu)/\sigma)$.

A distribution $F$ such that $(M_n - d_n)/c_n$ has limit $H$ is said to be in the
maximum domain of attraction of $H$. For applications, one can safely assume
a given distribution $F$ with infinite support to be in the maximum domain
of attraction of either the Fréchet or the Gumbel (but there are exceptions,
in particular discrete distributions like the geometric sometimes disturb the
picture). This means that the distribution of $M_k$ with $k$ large is likely to be
close to some $H_{\xi,\mu,\sigma}$.


³Note that the Weibull extreme value distribution is the negative analogue of the classical
Weibull distribution.

The statistical procedure based on this observation is to obtain $m$ approximately
i.i.d. replicates $M_{k,1}, \ldots, M_{k,m}$ of $M_k$ and use these as data for maximum
likelihood estimation of $\xi, \mu, \sigma$. Writing $n = km$, the $M_{k,i}$ can be obtained by
splitting the $n$ observations into $m$ blocks of size $k$ and letting $M_{k,i}$ be the
maximum over block $i$. The density $h_{\xi,\mu,\sigma}(x)$ of $H_{\xi,\mu,\sigma}$ is non-zero only when
$1 + \xi(x-\mu)/\sigma > 0$ and is given by

$$h_{\xi,\mu,\sigma}(x) = \frac{1}{\sigma\,(1 + \xi(x-\mu)/\sigma)^{1/\xi+1}}\,\exp\big\{-(1 + \xi(x-\mu)/\sigma)^{-1/\xi}\big\}$$

(taking $\xi > 0$ for simplicity). The log likelihood is therefore

$$-m\log\sigma - (1/\xi + 1)\sum_{i=1}^{m} \log\big(1 + \xi(M_{k,i}-\mu)/\sigma\big) - \sum_{i=1}^{m} \big(1 + \xi(M_{k,i}-\mu)/\sigma\big)^{-1/\xi}$$

and has to be maximized over the region

$$\xi > 0, \qquad \sigma > 0, \qquad \min_{i=1,\ldots,m}\big(1 + \xi(M_{k,i}-\mu)/\sigma\big) > 0.$$

Obviously, the maximization has to be done numerically.
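
As a starting point for such a numerical maximization, the block maxima and the log likelihood above can be coded in a few lines. This is only a sketch under the same simplification $\xi > 0$; the function names are assumptions, points outside the admissible region are assigned the value $-\infty$, and the optimizer itself is left to whatever numerical routine is at hand.

```python
import math


def block_maxima(data, k):
    """Split data into consecutive blocks of size k and return each block's maximum.

    Observations beyond the last full block are discarded.
    """
    m = len(data) // k
    return [max(data[i * k:(i + 1) * k]) for i in range(m)]


def gev_log_likelihood(xi, mu, sigma, maxima):
    """Log likelihood of H_{xi,mu,sigma} for xi > 0; -inf outside the admissible region."""
    if xi <= 0 or sigma <= 0:
        return -math.inf
    z = [1.0 + xi * (x - mu) / sigma for x in maxima]
    if min(z) <= 0:
        return -math.inf
    m = len(maxima)
    return (-m * math.log(sigma)
            - (1.0 / xi + 1.0) * sum(math.log(v) for v in z)
            - sum(v ** (-1.0 / xi) for v in z))


# Hand-checkable values: a single maximum at 0 with xi = 1, mu = 0, sigma = 1.
print(block_maxima([3, 1, 2, 5, 4, 0, 9], 3))
print(gev_log_likelihood(1.0, 0.0, 1.0, [0.0]))
```

Returning $-\infty$ outside the region lets a generic maximizer reject inadmissible parameter values without special handling of the constraint.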

The most interesting parameter to estimate is $\xi$. Namely, an estimate $\hat\xi$ that
is significantly larger than 0 indicates regular variation of $\bar F$, one that is close
to 0 indicates that most likely the tail is lighter than for regular variation.

In practice the uncertainty on the estimates is usually high. One reason 
is that the block size k needs to be taken large in order that the M;,; have 
a distribution reasonably close to the asymptotics predicted by extreme value 
theory. Thus, the sample size m for fitting the parameters will be orders of 
magnitude smaller than the actual number n of observations. Consequently, the 
resulting estimates should be interpreted with great care. 

Due to the difficulty with the waste of data by blocking, another method is
more popular in practice. It is based on the generalized Pareto distribution $G_{\xi,\beta}$
with tail

$$\bar G_{\xi,\beta}(x) = \begin{cases} \dfrac{1}{(1 + \xi x/\beta)^{1/\xi}} & \xi > 0, \\[2mm] e^{-x/\beta} & \xi = 0. \end{cases}$$

Thus for $\xi > 0$, $G_{\xi,\beta}$ is the distribution of $\beta X/\xi$ where $X$ has the standard
Pareto tail $(1+x)^{-\alpha}$ with $\alpha = 1/\xi$, and $G_{0,\beta}$ is the exponential($1/\beta$) limit as
$\xi \downarrow 0$. One has:


Lemma 6.2 If $X$ has distribution $G_{\xi,\beta}$, then the distribution $F_x$ of the
overshoot $X - x \mid X > x$ is $G_{\xi,\beta(x)}$ where $\beta(x) = \beta + \xi x$.

The proof is elementary and omitted. Furthermore:


Theorem 6.3 A distribution $F$ is in the maximum domain of attraction of a generalized
extreme value distribution $H_\xi$ with $\xi \ge 0$ if and only if there exist constants
$\beta(x)$ such that

$$\lim_{x \to \infty} \sup_{y > 0} \big|\bar F_x(y) - \bar G_{\xi,\beta(x)}(y)\big| = 0.$$

The proof is also omitted, but is not elementary!
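
The stability property in Lemma 6.2 is easy to check numerically: the conditional tail $\bar G_{\xi,\beta}(x+y)/\bar G_{\xi,\beta}(x)$ should coincide with $\bar G_{\xi,\beta+\xi x}(y)$. The sketch below does this for arbitrary illustrative parameter values; the function names are assumptions.

```python
import math


def gpd_tail(x, xi, beta):
    """Tail of the generalized Pareto distribution G_{xi,beta} at x >= 0 (xi >= 0)."""
    if xi == 0:
        return math.exp(-x / beta)
    return (1.0 + xi * x / beta) ** (-1.0 / xi)


def overshoot_tail(y, x, xi, beta):
    """P(X - x > y | X > x) for X with distribution G_{xi,beta}."""
    return gpd_tail(x + y, xi, beta) / gpd_tail(x, xi, beta)


# Illustrative check with xi = 0.5, beta = 2, threshold x = 3, overshoot level y = 1.5.
lhs = overshoot_tail(1.5, 3.0, 0.5, 2.0)
rhs = gpd_tail(1.5, 0.5, 2.0 + 0.5 * 3.0)  # G_{xi, beta(x)} with beta(x) = beta + xi * x
print(lhs, rhs)
```

For $\xi = 0$ the scale does not change with the threshold, which is the memoryless property of the exponential distribution.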

As noted above, one can almost always safely assume a given distribution $F$
with infinite support to satisfy the assumptions of Theorem 6.3. This motivates
that for tail estimation, one assumes that $F_x$ has a $G_{\xi,\beta}$ distribution for all large
$x$ (where $\beta$ depends on $x$), selects some large but fixed threshold $x$ and estimates
$\xi, \beta$ from the observations exceeding $x$. Then, letting $N(x) = \#\{i : X_i > x\}$,
the final estimate of the tail of $F$ is

$$\hat{\bar F}(y) = \frac{N(x)}{n}\,\bar G_{\hat\xi,\hat\beta}(y - x), \qquad y > x, \tag{6.2}$$

where $\hat\xi, \hat\beta$ are the estimates of $\xi$, $\beta = \beta(x)$. To obtain $\hat\xi, \hat\beta$, one lets $Y_i = X_{j_i} - x$,
$i = 1, \ldots, N(x)$ where $j_0 = 0$, $j_i = \inf\{j > j_{i-1} : X_j > x\}$ and maximizes the
log likelihood

$$\sum_{i=1}^{N(x)} \log g_{\xi,\beta}(Y_i) = -N(x)\log\beta - (1/\xi + 1)\sum_{i=1}^{N(x)} \log(1 + \xi Y_i/\beta),$$

where $g_{\xi,\beta}$ is the density of $G_{\xi,\beta}$.
The $Y_i$ represent peaks over the threshold $x$, and for this reason, the method
and its extensions go under the name POT.
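
The POT recipe above can be sketched as follows. The log likelihood and the tail estimate (6.2) are coded as stated; the maximization is replaced by a crude grid search, which is an assumption of this sketch made only to keep it self-contained and is not a recommended fitting method. All function names are hypothetical, and $\xi > 0$ is assumed throughout.

```python
import math


def exceedances(data, x):
    """Peaks over the threshold x: the values X_j - x for those X_j exceeding x."""
    return [v - x for v in data if v > x]


def gpd_log_likelihood(xi, beta, ys):
    """Log likelihood of G_{xi,beta} for the exceedances ys (xi > 0, beta > 0)."""
    if xi <= 0 or beta <= 0:
        return -math.inf
    return (-len(ys) * math.log(beta)
            - (1.0 / xi + 1.0) * sum(math.log(1.0 + xi * y / beta) for y in ys))


def fit_gpd_grid(ys, xi_grid, beta_grid):
    """Crude grid-search maximizer of the log likelihood (illustration only)."""
    return max(((xi, beta) for xi in xi_grid for beta in beta_grid),
               key=lambda pair: gpd_log_likelihood(pair[0], pair[1], ys))


def pot_tail_estimate(y, data, x, xi, beta):
    """Tail estimate (6.2): N(x)/n times the G_{xi,beta} tail at y - x, for y > x."""
    n_exceed = sum(1 for v in data if v > x)
    return n_exceed / len(data) * (1.0 + xi * (y - x) / beta) ** (-1.0 / xi)


# Tiny illustrative data set with threshold x = 1.
data = [0.5, 1.0, 2.0, 3.0, 5.0, 9.0]
ys = exceedances(data, 1.0)
xi_hat, beta_hat = fit_gpd_grid(ys, [0.25, 0.5, 1.0], [1.0, 2.0, 4.0])
print(xi_hat, beta_hat, pot_tail_estimate(20.0, data, 1.0, xi_hat, beta_hat))
```

In practice one would replace the grid search by a proper numerical optimizer and, as for the block maxima method, examine how sensitive the estimates are to the choice of the threshold $x$.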


6c The Hill estimator 


We now assume that $F$ is either regularly varying, $\bar F(x) = L(x)/x^\alpha$, or
light-tailed satisfying (6.1). The problem is to estimate $\alpha$.

Even with $L$ or $\ell$ completely specified, the maximum likelihood estimator
(MLE) is not adequate in this connection, because maximum likelihood will try
to adjust $\alpha$ so that the fit is good in the center of the distribution, without
caring too much about the tail, where there are fewer observations. The Hill
estimator is the most commonly used (though not the only) estimator designed
specifically to take this into account.

To explain the idea, consider first the setting of (6.1). If we ignore
fluctuations in $\ell(x)$ by replacing $\ell(x)$ by a constant, the $X_j - x$ with $X_j > x$ are i.i.d.
exponential($\alpha$). Since the standard MLE of $\alpha$ in the (unshifted) exponential
distribution is $n/(X_1 + \cdots + X_n)$, the MLE $\hat\alpha$ based on these selected $X_j$ alone
is

$$\frac{\#\{j : X_j > x\}}{\sum_{j : X_j > x} (X_j - x)}.$$

The Hill plot is this quantity plotted as function of $x$ or the number $\#\{j : X_j > x\}$
of observations used. As for the mean excess plot, one usually plots only at the
(say) $k$ largest $j$ or the $k$ largest $X_j$. That is, one plots

$$\frac{k}{\sum_{\ell = n-k+1}^{n} \big(X_{(\ell)} - X_{(n-k)}\big)} \tag{6.3}$$

as function of either $k$ or $X_{(n-k)}$. The Hill estimator $\alpha^H_{n,k}$ is (6.3) evaluated at
some specified $k$. However, most often one checks graphically whether the Hill
plot looks reasonably constant in a suitable range and takes a typical value from
there as the estimate of $\alpha$.

The regularly varying case can be treated by entirely the same method, or 
one may remark that it is 1-to-1 correspondence with (6.1) because X has tail 
L(x)/x®“ if and only if log X has tail (6.1). Therefore, the Hill estimator in the 
regularly varying case is 


(6.3) 


k 
Yten—K41 (log Xy) — log X(n—ny) 


It can be proved that if k = k(n) — œ but k/n — 0, then weak consistency 


(6.4) 


alt k =, a holds. No conditions on L are needed for this. One might think that 
the next step would be the estimation of the slowly varying function L, but 
this is in general considered impossible among statisticians. In fact, we will see 
below that there are already difficulties enough with af g itself. 
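
As a rough illustration of (6.4), the following Python sketch computes the Hill estimate from the $k$ largest order statistics of a simulated sample. The function names, the seed and the choice of `random.paretovariate` (which has tail $x^{-\alpha}$ on $[1,\infty)$, i.e. $L \equiv 1$) are assumptions made here for illustration only; they are not taken from the text.

```python
import math
import random

def hill_estimator(data, k):
    """Hill estimate of alpha from the k largest observations, as in (6.4)."""
    if not 1 <= k < len(data):
        raise ValueError("need 1 <= k < len(data)")
    xs = sorted(data)                      # order statistics X_(1) <= ... <= X_(n)
    threshold = math.log(xs[-k - 1])       # log X_(n-k)
    excess = sum(math.log(x) - threshold for x in xs[-k:])
    return k / excess

def simulate_pareto(n, alpha, seed=1):
    """n i.i.d. draws with tail x**(-alpha) on [1, inf); illustrative data only."""
    rng = random.Random(seed)
    return [rng.paretovariate(alpha) for _ in range(n)]

sample = simulate_pareto(10_000, 1.5)
for k in (50, 200, 500, 2000):
    # One point of the Hill plot per value of k.
    print(k, round(hill_estimator(sample, k), 3))
```

For this exactly Pareto sample the estimates should fluctuate around 1.5; with a non-trivial slowly varying $L$ the dependence on $k$ is typically much less benign.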


Example 6.4 Figure X.5 contains the Hill plot (6.4) of simulated data (now 
with n = 10000) from the same six distributions as in Example 6.1 and the 
number k = 10,...,2000 of order statistics used on the horizontal axis. 

Of course, only the first row, Pareto(3/2), is meaningful, since the distri- 
butions in the remaining rows are not regularly varying. Nevertheless, the ap- 
pearance of the second row of plots, Weibull, is so close to the first that it is 
hard to assert from this alone that the distribution is not regularly varying (the 
evidence from the mean excess plot in Figure X.4 is not that conclusive either). 
The same holds, though maybe in a somewhat weaker form, for the exponential 
case in the third row. The first row also clearly demonstrates the difficulty in
choosing $k$. Maybe one would settle for a value between 50 and 500 in the left
panel, giving an estimate of $\alpha$ between 1.6 and 1.4.

The high statistical uncertainty on an estimate $\hat\alpha$ of $\alpha$ is of course reflected
in a high uncertainty on the estimates of the tail we obtain by plugging in $\hat\alpha$
instead of $\alpha$ in the parametric expression for the tail. To quantify this point,
consider again the Pareto(3/2) example, where it was not easy to assess from
our simulation studies whether one would use $\hat\alpha = 1.4$, 1.5 or 1.6 or values
even further from 1.5. In the following table, we use these three $\alpha$-values and
compute the tail probabilities $\overline F_{\hat\alpha}(x)$ for the 4 $x$-values in the first row (chosen
as the 99%, 99.9%, 99.99% and 99.999% quantiles of $F_{1.5}$).


$\hat\alpha$ \ $x$ | 20.5    99.0     463       2153
1.6 | 0.007   0.0006   0.00005   0.000005
1.5 | 0.010   0.0010   0.00010   0.000010
1.4 | 0.014   0.0016   0.00018   0.000022
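
The entries of the table can be recomputed with a few lines of Python. The sketch assumes the parametrization $\overline F_\alpha(x) = (1+x)^{-\alpha}$ for the Pareto($\alpha$) tail; this is inferred here from the quantiles in the first row and is not stated explicitly in this section. The function names are illustrative.

```python
def pareto_tail(x, alpha):
    """Tail (1 + x)**(-alpha); the parametrization is an assumption of this sketch."""
    return (1.0 + x) ** (-alpha)

def pareto_quantile(p, alpha):
    """Solve (1 + x)**(-alpha) = 1 - p for x."""
    return (1.0 - p) ** (-1.0 / alpha) - 1.0

levels = (0.99, 0.999, 0.9999, 0.99999)
xs = [pareto_quantile(p, 1.5) for p in levels]   # quantiles under alpha = 1.5
print([round(x, 1) for x in xs])
for alpha_hat in (1.6, 1.5, 1.4):
    # Plug-in tail probabilities at the same x-values.
    print(alpha_hat, [f"{pareto_tail(x, alpha_hat):.2g}" for x in xs])
```

At the 99.999% quantile the three plug-in values differ by a factor of more than four, which is the point made in the text.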


There is also a CLT $k^{1/2}(\hat\alpha^H_{n,k} - \alpha) \xrightarrow{D} N(0, \alpha^2)$. For this, however, stronger
conditions on $k = k(n)$ and $L$ are needed. In particular, the correct choice of $k$
requires delicate estimates of $L$. That $L$ can present a major complication has
also been observed in the many “Hill horror plots” in the literature.


Notes and references Some standard textbooks on the topic are Embrechts,
Klüppelberg, & Mikosch [349], McNeil et al. [633], Resnick [738], Beirlant et al. [154]
and de Haan & Ferreira [284]. Another simple technique to estimate $\alpha \in (0,2)$ from
i.i.d. observations is given in Albrecher & Teugels [37].

One should note that the area is rapidly expanding and that much literature recently
also deals with dependence contexts which are not considered here.
Chapter XI


Ruin probabilities for Lévy 
processes 


1 Preliminaries 


An important family of stochastic processes arising in many areas of applied
probability is the class of Lévy processes. A process $X = \{X_t\}_{t \ge 0}$ is said to
be a Lévy process if it has D-paths and stationary and independent increments.
Often one requires also $X_0 = 0$, but at some instances we will also allow for
starting values $X_0 = u \ne 0$ and then write $\mathbb{P}_u$ for the governing probability
measure (if $u = 0$, we simply write $\mathbb{P}$). For the purposes of ruin theory we
will usually think of $X_t$ as claim surplus $S_t$ at time $t$, in which case indeed
$S_0 = 0$ and as usual, the ruin time is then $\tau(u) = \inf\{t > 0 : X_t > u\}$ and the
infinite horizon ruin probability is $\psi(u) = \mathbb{P}_0(\tau(u) < \infty)$. At some points it
will alternatively be convenient to think of $X_t$ as the reserve process $R_t$ with
starting value $R_0 = u > 0$. One easily checks that under the D-path assumption
the strong Markov property holds for Lévy processes (see e.g. [APQ, p. 35]).


Standard Brownian motion $B$ is a Lévy process, and so is a Brownian motion
$\{\mu t + \sigma B_t\}$ with general drift and variance parameters. A further fundamental
example is the counting process $N_\beta$ of a Poisson process, where $\beta$ is the rate.
In fact, one of the central results in the foundations of Lévy processes is that
any Lévy process can be represented as an independent sum of a Brownian
motion and a ‘compound Poisson’-like process. In particular, any Lévy process
exhibiting finitely many jumps per unit time can be represented as

$$X_t = \mu t + \sigma B_t + \sum_{i=1}^{N_\beta(t)} Y_i$$

for $t > 0$, where the $Y_i$ are i.i.d. and independent of $B$, $N_\beta$. This covers in
particular the compound Poisson claim surplus process, where $\sigma^2 = 0$, $\mu = -1$
and the $Y_i$ are positive. However, there are Lévy processes for which the
non-Brownian jump component $J = \{J(t)\}_{t \ge 0}$ exhibits infinitely many jumps per
unit time. Dealing with such processes is the main topic of this chapter.
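
The finite-activity representation above is easy to simulate. The following sketch draws $X_t$ by adding a drift, a Gaussian term and the jumps of a Poisson process generated from exponential inter-arrival times. The parameter values (rate 1, exponential(2) jumps, $\sigma = 0.5$), the seed and the function name are hypothetical choices for illustration.

```python
import math
import random

def simulate_levy_endpoint(t, mu, sigma, beta, jump_sampler, rng):
    """One draw of X_t = mu*t + sigma*B_t + (sum of the jumps arriving in (0, t])."""
    value = mu * t + sigma * math.sqrt(t) * rng.gauss(0.0, 1.0)
    clock = rng.expovariate(beta)          # first Poisson arrival time
    while clock <= t:
        value += jump_sampler(rng)         # add one jump Y_i
        clock += rng.expovariate(beta)     # exponential(beta) inter-arrival times
    return value

rng = random.Random(2024)
# Hypothetical parameters: mu = -1, sigma = 0.5, beta = 1, exponential(2) jumps.
draws = [
    simulate_levy_endpoint(1.0, -1.0, 0.5, 1.0, lambda r: r.expovariate(2.0), rng)
    for _ in range(20_000)
]
# The sample mean should be close to mu + beta * E[Y] = -1 + 1 * 0.5 = -0.5.
print(sum(draws) / len(draws))
```

This approach does not extend directly to the infinite-activity case treated below, where the small jumps have to be truncated or approximated.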

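The jump process $J$ is characterized by its Lévy measure $\nu(dx)$, which can
be any non-negative measure on $\mathbb{R}$ satisfying $\nu(\{0\}) = 0$ and

$$\int_{-\infty}^{\infty} (y^2 \wedge 1)\,\nu(dy) < \infty. \tag{1.1}$$

Equivalently, $\int_{\{|y| > \varepsilon\}} \nu(dy)$ and $\int_{-\varepsilon}^{\varepsilon} y^2\,\nu(dy)$ are finite for some (and then all)
$\varepsilon > 0$.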

A rough description of $J$ is that jumps of size $y$ occur at intensity $\nu(dy)$. In
particular, if $\nu$ has finite mass $\lambda = \int_{-\infty}^{\infty} \nu(dy) < \infty$, then $J$ is a compound
Poisson process with intensity $\lambda$ and jump size distribution $\nu(dy)/\lambda$. In general,
for any bounded interval $K$ separated from 0, the sum of the jumps of size
$\in K$ in the time-interval $[s, s+t)$ is a compound Poisson r.v. with intensity
$t\lambda_K = t\int_K \nu(dy)$ and jump-size distribution $\nu(dy)I(y \in K)/\lambda_K$. Jumps in
disjoint intervals are independent, and so we can describe the totality of jumps
by the points in a planar Poisson process $N(dy, dt)$ with intensity measure
$\nu(dy) \otimes dt$. A point of $N$ at $(Y_i, T_i)$ then corresponds to a jump of size $Y_i$
at time $T_i$ for $J$. If in addition to (1.1) one has

$$\int_{-\infty}^{\infty} (|y| \wedge 1)\,\nu(dy) < \infty \tag{1.2}$$

(this is equivalent to the paths of $J$ being of finite variation), one can simply
write

$$J(t) = \int_{\mathbb{R} \times (0,t]} y\,N(dy, ds). \tag{1.3}$$

If (1.2) fails, this Poisson integral does not converge absolutely, and $J$ has to be
defined by a compensation (centering) procedure. For example, letting

$$Y_0(t) = \int_{\{y:\,|y| > 1\} \times [0,t]} y\,N(dy, ds), \qquad Y_n(t) = \int_{\{y:\,|y| \in (y_{n+1}, y_n]\} \times [0,t]} y\,N(dy, ds),$$

one can let

$$J(t) = Y_0(t) + \sum_{n=1}^{\infty} \{Y_n(t) - \mathbb{E}Y_n(t)\}, \tag{1.4}$$

where $1 = y_1 > y_2 > \cdots \downarrow 0$ and

$$\mathbb{E}Y_n(t) = t\int_{|y| \in (y_{n+1}, y_n]} y\,\nu(dy).$$


The series converges a.s. since

$$\sum_{n=1}^{\infty} \operatorname{Var}(Y_n(t)) = t\sum_{n=1}^{\infty} \int_{|y| \in (y_{n+1}, y_n]} y^2\,\nu(dy) = t\int_{-1}^{1} y^2\,\nu(dy) < \infty,$$

and the sum is easily seen to be independent of the particular partitioning $\{y_n\}$.
But note that since the role of the interval $[-1, 1]$ is arbitrary, a compensated
Lévy jump process is given canonically only up to a drift term.

If $J$ has non-decreasing paths, then $J$ is called a subordinator. The Lévy
measure for a subordinator necessarily satisfies (1.2), and any Lévy jump process
satisfying (1.2) can be written as the independent difference between two
subordinators, defined in terms of the restriction of $\nu$ to $(0, \infty)$ and, respectively,
the restriction of $\nu$ to $(-\infty, 0)$ reflected to $(0, \infty)$ (possibly a positive drift term
has to be added).

The property of stationary independent increments implies that $\log \mathbb{E}e^{rX_t}$
has the form $t\kappa(r)$. Here $\kappa(r)$ is called the Lévy exponent (also often referred
to as Laplace exponent); its domain includes the imaginary axis $\Re(r) = 0$ and
frequently larger sets depending on properties of $\nu$, say $\{r : \Re(r) \le 0\}$ in the
case of a subordinator. Thus, $\kappa(r)$ is the cumulant g.f. of an infinitely divisible
distribution, having the Lévy-Khinchine representation

$$\kappa(r) = cr + \frac{\sigma^2 r^2}{2} + \int_{-\infty}^{\infty} \bigl(e^{ry} - 1 - ryI(|y| \le 1)\bigr)\,\nu(dy), \tag{1.5}$$

where one refers to $(c, \sigma^2, \nu)$ as the characteristic triplet.

In the finite variation case (1.2), the Lévy-Khinchine representation (1.5) is
often written

$$\kappa(r) = c_1 r + \frac{\sigma^2 r^2}{2} + \int_{-\infty}^{\infty} (e^{ry} - 1)\,\nu(dy), \tag{1.6}$$

where $c_1 = c - \int_{-1}^{1} y\,\nu(dy)$.¹

¹Note that in much of the literature on Lévy processes $c$ or $c_1$ is referred to as the drift,
whereas in the sequel we will refer to $\mathbb{E}[X_1]$ as the drift.

The Lévy exponent’s derivatives at 0 give the cumulants of $X_1$. In particular,
$\mathbb{E}X_t = t\kappa'(0)$, $\operatorname{Var}X_t = t\kappa''(0)$.
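
The relation between $\kappa$ and the first two cumulants can be checked numerically. The sketch below uses, as an assumed example, the compound Poisson claim surplus with Poisson rate `beta`, exponential(`nu`) claims and premium rate 1, for which $\kappa(r) = \beta(\nu/(\nu - r) - 1) - r$; the derivatives at 0 are approximated by central finite differences. All names and parameter values are illustrative.

```python
def kappa_compound_poisson(r, beta=1.0, nu=2.0):
    """kappa(r) = beta*(nu/(nu - r) - 1) - r: exponential(nu) claims, premium rate 1."""
    if r >= nu:
        raise ValueError("kappa(r) is finite only for r < nu")
    return beta * (nu / (nu - r) - 1.0) - r

def derivative_at_zero(f, order, h=1e-3):
    """Central finite-difference approximation of f'(0) (order 1) or f''(0) (order 2)."""
    if order == 1:
        return (f(h) - f(-h)) / (2.0 * h)
    if order == 2:
        return (f(h) - 2.0 * f(0.0) + f(-h)) / (h * h)
    raise ValueError("order must be 1 or 2")

t = 3.0
mean_t = t * derivative_at_zero(kappa_compound_poisson, 1)   # approximates E X_t
var_t = t * derivative_at_zero(kappa_compound_poisson, 2)    # approximates Var X_t
# Closed forms for comparison: t*(beta/nu - 1) and t*2*beta/nu**2.
print(mean_t, var_t)
```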


Notes and references Bertoin [157] and Sato [763] are the classical references 
for Lévy processes, but there are also some good recent texts such as Applebaum [49] 
and Kyprianou [564]. A good impression of the many directions into which the topic 
has been developed and applied can be obtained from the volume edited by Barndorff- 
Nielsen et al. [137]. 

An early appearance of using Lévy processes (beyond the compound Poisson model 
and diffusion) for risk reserve modeling is Dufresne, Gerber & Shiu [335], for a more 
recent discussion see Morales & Schoutens [650]. Apart from mathematical elegance 
and generality, one often used argument to justify the use of these more general Lévy 
processes for risk modeling is that they lead to explicit knowledge of the distribution of 
aggregate claims (by construction via the infinitely divisible generating distribution), 
so instead of modeling the individual claims and compounding them, here the approach 
is rather to model (and eventually calibrate) the aggregate claim and then break down 
this information to infer consequences about the behavior on the individual claim level. 
Models with infinitely many jumps in finite time intervals are obviously not directly 
useful in the claims modelling. At the same time, the idea to calibrate the aggregate 
effects directly and to check the suitability of the resulting model through robustness 
and recalibration techniques is very popular in quantitative finance and may have a 
certain degree of attractiveness in the insurance context as well. 


1a Special Lévy processes


In the examples we treat, the Lévy measure will have a density w.r.t. Lebesgue
measure, which we denote by $n = d\nu/dx$. The density of $X_t$ is denoted by $f_t(x)$
throughout.


Example 1.1 For $0 < \alpha < 2$, $\alpha \ne 1$, the $\alpha$-stable $S_\alpha(\sigma, \beta, \mu)$ distribution is
defined as the distribution with c.g.f. of the form

$$\kappa(r) = -\sigma^\alpha |r|^\alpha \Bigl(1 - \beta\,\mathrm{sign}(r/i)\tan\frac{\pi\alpha}{2}\Bigr) + r\mu, \qquad \Re r = 0,$$

for some $\sigma > 0$, $\beta \in [-1, 1]$, and $\mu \in \mathbb{R}$. There is a similar but somewhat
different expression, which we omit, when $\alpha = 1$. The reader should note that
the theory is somewhat different according to whether $0 < \alpha < 1$, $\alpha = 1$, or
$1 < \alpha < 2$.

If the r.v. $Y$ has an $S_\alpha(\sigma, \beta, \mu)$ distribution, then $Y + a$ has an $S_\alpha(\sigma, \beta, \mu + a)$
distribution and $aY$ an $S_\alpha(\sigma|a|, \mathrm{sign}(a)\beta, a\mu)$ distribution. Thus, $\mu$ is a translation
parameter and $\sigma$ a scale parameter. The interpretation of $\beta$ is as a skewness
parameter, as will be clear from the discussion of stable processes to follow.
A stable process is defined as a Lévy jump process in which $X_1$ has an $\alpha$-stable
$S_\alpha(\sigma, \beta, 0)$ distribution. This can be obtained by choosing the Lévy density as

$$n(y) = \begin{cases} C_+ / y^{\alpha+1}, & y > 0, \\ C_- / |y|^{\alpha+1}, & y < 0, \end{cases} \tag{1.7}$$

with

$$C_\pm = C_\alpha \frac{1 \pm \beta}{2}, \qquad C_\alpha = \sigma^\alpha\, \frac{\alpha(1 - \alpha)}{\Gamma(2 - \alpha)\cos(\pi\alpha/2)}.$$

One can reconstruct $\beta$ from the Lévy measure as $\beta = (C_+ - C_-)/(C_+ + C_-)$.
If $0 < \alpha < 1$, then (1.2) holds and the process can be defined by (1.3). If
$1 \le \alpha < 2$, compensation is needed and care must be taken to choose the drift
term to get $\mu = 0$. Stable processes have a scaling property (self-similarity)
similar to Brownian motion, $\{T^{-1/\alpha} X_{tT}\}_{t \ge 0} \stackrel{D}{=} \{X_t\}_{t \ge 0}$ ($\mu = 0$ is crucial for
this!).

Stable processes and some of their modifications are treated in depth in
Samorodnitsky & Taqqu [764].


Example 1.2 An important property of stable processes is that the Lévy density
and hence the marginals have heavy tails. A modification with light tails
corresponds to the Lévy density

$$n(x) = \begin{cases} C_+ e^{-Mx} / x^{1+Y}, & x > 0, \\ C_- e^{-G|x|} / |x|^{1+Y}, & x < 0, \end{cases}$$

where $C_+, C_- \ge 0$, $C_+ + C_- > 0$, $G, M > 0$, $0 \le Y < 2$. Such a Lévy
process is called a tempered stable process. For $Y > 0$ and $C_+ = C_- = C$,
the corresponding Lévy process is called the CGMY process²; for $Y = 0$ and
$C_+ = C_- = C$, the process is called the Variance Gamma process. The Lévy
exponent is

$$\kappa(r) = C_+\Gamma(-Y)\bigl[(M - r)^Y - M^Y\bigr] + C_-\Gamma(-Y)\bigl[(G + r)^Y - G^Y\bigr].$$

²CGMY = Carr-Geman-Madan-Yor; cf. the notation for the parameters!

Example 1.3 Since the Gamma distribution with a density proportional to
$x^{\alpha-1}e^{-\lambda x}$ is infinitely divisible, there is a Lévy process with this distribution of
$X_1$. For obvious reasons, it is called the Gamma process. The Lévy measure
can be shown to have density $n(x) = \alpha e^{-\lambda x}/x$ for $x > 0$; note that $n(x) \sim \alpha x^{-1}$,
$x \downarrow 0$, so the Lévy measure is infinite but at the borderline of being so. Hence
small jumps play a relatively small role for the Gamma process. By standard
properties of the Gamma distribution,

$$\kappa(r) = \alpha \log\frac{\lambda}{\lambda - r}, \qquad f_t(x) = \frac{\lambda^{\alpha t}}{\Gamma(\alpha t)}\, x^{\alpha t - 1} e^{-\lambda x}.$$

The Variance Gamma process in Example 1.2 is the difference between two
independent Gamma processes.
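
For the Gamma process the finite variation form (1.6) of the Lévy-Khinchine representation (with $c_1 = 0$, $\sigma^2 = 0$) can be verified numerically: integrating $(e^{rx} - 1)\,n(x)$ over $(0, \infty)$ should reproduce $\alpha\log(\lambda/(\lambda - r))$. The sketch below does this with a plain midpoint rule; the truncation point, step count and parameter values are arbitrary choices made here, and the function names are illustrative.

```python
import math

def gamma_levy_density(x, alpha, lam):
    """Levy density n(x) = alpha * exp(-lam * x) / x of the Gamma process, x > 0."""
    return alpha * math.exp(-lam * x) / x

def kappa_numeric(r, alpha, lam, upper=40.0, steps=40_000):
    """Midpoint rule for the integral of (e^{rx} - 1) n(x) over (0, upper], r < lam."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        # expm1 keeps (e^{rx} - 1) accurate near x = 0, where n(x) blows up like 1/x.
        total += math.expm1(r * x) * gamma_levy_density(x, alpha, lam)
    return total * h

def kappa_closed_form(r, alpha, lam):
    """kappa(r) = alpha * log(lam / (lam - r)), valid for r < lam."""
    return alpha * math.log(lam / (lam - r))

print(kappa_numeric(1.0, 2.0, 3.0), kappa_closed_form(1.0, 2.0, 3.0))
```

The integrand stays bounded near 0 because $e^{rx} - 1 \approx rx$ cancels the $1/x$ singularity of the density, which is the borderline behaviour mentioned in the example.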


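Example 1.4 The Normal Inverse Gaussian (NIG) Lévy process has four parameters
$\alpha, \delta > 0$, $\beta \in (-\alpha, \alpha)$, $\mu \in \mathbb{R}$, and

$$\kappa(r) = \mu r - \delta\Bigl(\sqrt{\alpha^2 - (\beta + r)^2} - \sqrt{\alpha^2 - \beta^2}\Bigr).$$

The Lévy measure has density

$$n(x) = \frac{\alpha\delta}{\pi |x|}\, K_1(\alpha|x|)\, e^{\beta x}, \qquad x \in \mathbb{R}, \tag{1.8}$$

(here as usual $K_1$ denotes the modified Bessel function of the third kind with
index 1), and the density of $X_1$ is

$$f_1(x) = \frac{\alpha\delta}{\pi} \exp\Bigl\{\delta\sqrt{\alpha^2 - \beta^2} + \beta(x - \mu)\Bigr\}\, \frac{K_1\bigl(\alpha\sqrt{\delta^2 + (x - \mu)^2}\bigr)}{\sqrt{\delta^2 + (x - \mu)^2}},$$

which is called the NIG($\alpha, \beta, \mu, \delta$) density; clearly the density $f_t(x)$ of $X_t$ is
NIG($\alpha, \beta, t\mu, t\delta$).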


Example 1.5 Let $X$ be any Lévy process with nonnegative drift. Then $T(x) =
\inf\{t : X(t) > x\}$ is finite a.s., and clearly $\{T(x)\}_{x \ge 0}$ has stationary independent
increments, so it is a Lévy process (in fact a subordinator, since the sample
paths are nondecreasing).

The most notable example is the Inverse Gaussian Lévy process, which corresponds
to $X$ being Brownian motion with drift $\gamma > 0$ and variance 1. Here

$$f_t(x) = \frac{t}{\sqrt{2\pi}\, x^{3/2}} \exp\Bigl\{\gamma t - \frac{1}{2}\Bigl(\frac{t^2}{x} + \gamma^2 x\Bigr)\Bigr\}$$

(cf. also Corollary III.1.6) and the Lévy measure has density

$$n(x) = \frac{1}{\sqrt{2\pi}\, x^{3/2}}\, e^{-x\gamma^2/2}, \qquad x > 0.$$
1b Exponential change of measure


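As for the compound Poisson model and random walks, exponential change of
measure also plays a main role for Lévy processes. It is also clear from the
analogy with these classical models what should be the appropriate definition of
an exponential $\theta$-tilting for a $\theta$ satisfying $\kappa(\theta) < \infty$: change the Lévy exponent
$\kappa(r)$ to $\kappa_\theta(r) = \kappa(r + \theta) - \kappa(\theta)$.

Proposition 1.6 Assume that $X$ has characteristic triplet $(c, \sigma^2, \nu)$. Then
$\kappa_\theta(r)$ is the Lévy exponent of the Lévy process with characteristic triplet $(c_\theta, \sigma_\theta^2, \nu_\theta)$,
where $\sigma_\theta^2 = \sigma^2$, $\nu_\theta(dx) = e^{\theta x}\nu(dx)$, and

$$c_\theta = c + \sigma^2\theta + \int_{-1}^{1} (e^{\theta x} - 1)\,x\,\nu(dx).$$

Proof. In view of (1.5) we have

$$\begin{aligned}
\kappa_\theta(r) &= \kappa(r + \theta) - \kappa(\theta) \\
&= (c + \sigma^2\theta)r + \sigma^2 r^2/2 + \int_{-\infty}^{\infty} \bigl(e^{(r+\theta)x} - e^{\theta x} - rxI(|x| \le 1)\bigr)\,\nu(dx) \\
&= \Bigl(c + \sigma^2\theta + \int_{-1}^{1} (e^{\theta x} - 1)\,x\,\nu(dx)\Bigr) r + \sigma^2 r^2/2 \\
&\qquad + \int_{-\infty}^{\infty} \bigl(e^{rx} - 1 - rxI(|x| \le 1)\bigr)\,e^{\theta x}\,\nu(dx).
\end{aligned}$$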

Letting $\mathbb{P}$ be the governing probability measure for $X$ and $\mathbb{P}_\theta$ the one for the
exponentially tilted Lévy process, the likelihood ratio on $[0, T]$ takes the form

$$\frac{d\mathbb{P}}{d\mathbb{P}_\theta}\Big|_{\mathscr{F}_T} = e^{-\theta X(T) + T\kappa(\theta)},$$

as may be seen, for example, by discrete-random-walk approximations.
Often exponential change of measure and other calculations involve roots $r$
of an equation of the form $\kappa(r) = \delta$ where $\kappa$ is the Lévy exponent and $\delta > 0$.
By convexity and $\kappa(0) = 0$, $\kappa(r) = \delta$ has (depending on the domain of existence
of $\kappa$) either zero, one or two real roots and we denote by $-\rho_\delta$ the smallest one.
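
The two operations just described, tilting the exponent and solving $\kappa(r) = \delta$, can be sketched in a few lines. As an assumed example the code uses the compound Poisson claim surplus with rate `beta`, exponential(`nu`) claims and premium rate 1, so that $\kappa(r) = \beta r/(\nu - r) - r$ for $r < \nu$; the bisection helper and all names are illustrative and not taken from the text.

```python
def kappa(r, beta=1.0, nu=2.0):
    """Levy exponent beta*r/(nu - r) - r of a compound Poisson claim surplus (r < nu)."""
    if r >= nu:
        raise ValueError("kappa(r) is finite only for r < nu")
    return beta * r / (nu - r) - r

def tilt(kappa_fn, theta):
    """Return the tilted exponent kappa_theta(r) = kappa(r + theta) - kappa(theta)."""
    k_theta = kappa_fn(theta)
    return lambda r: kappa_fn(r + theta) - k_theta

def bisect_root(f, lo, hi, tol=1e-12):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("root is not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f_lo * f(mid) <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f(mid)
    return 0.5 * (lo + hi)

delta = 0.1
smallest = bisect_root(lambda r: kappa(r) - delta, -5.0, 0.0)    # this is -rho_delta
largest = bisect_root(lambda r: kappa(r) - delta, 0.0, 1.999)    # the second real root
kappa_1 = tilt(kappa, 1.0)    # tilting with theta = nu - beta = 1, where kappa(theta) = 0
print(smallest, largest, kappa_1(0.0))
```

By convexity the two brackets $[-5, 0]$ and $[0, 1.999]$ each contain exactly one root for these parameter values; for other exponents the brackets would have to be adapted to the domain of $\kappa$.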


Notes and references Many of the examples mentioned above are frequently used
for asset price modeling in finance. The NIG Lévy process and the Variance Gamma
process are particular examples of generalized hyperbolic Lévy processes. The generalized
hyperbolic distribution $Y$ was introduced by Barndorff-Nielsen [135] as a normal
variance-mean mixture (i.e. both the mean and variance of the normal distribution
are distributed according to an (appropriately scaled) mixing distribution $W$) with a
generalized inverse Gaussian mixing distribution $W$. For the NIG distribution $W$ is the
inverse Gaussian distribution, for the Variance Gamma distribution $W$ is a Gamma
distribution, giving the corresponding processes their names. Another popular approach
is to interpret the NIG and VG Lévy process as Brownian motion subordinated by
an inverse Gaussian and a Gamma process, respectively. For details see e.g. Bibby &
Sørensen [164] or Schoutens [783].


2 One-sided ruin theory 


In this section, we give the results (both asymptotic and exact) for the infinite
horizon ruin probability $\psi(u)$ that can be derived with reasonable effort. We
assume throughout that the claim surplus process $\{S_t\}$ is a Lévy process with
negative drift, i.e. $\kappa'(0) < 0$ and, to avoid trivialities, that it is not the negative
of a subordinator (in which case $\psi(u) = 0$ for all $u > 0$).

Going beyond the compound Poisson model to general Lévy processes, heavy 
tails are a remarkably simple case, and we have the following analogue of results 
from Chapter X: 


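Theorem 2.1 Assume that $\{S_t\}$ is a Lévy process with $\mathbb{E}S_1 = \kappa'(0) < 0$ and
the Lévy measure $\nu$ satisfying $\overline\nu(x) = \int_x^{\infty} \nu(dy) \sim \overline B(x)$ as $x \to \infty$ for some
distribution $B$ such that the integrated tail $B_0$ of $B$ is subexponential. Then

$$\psi(u) \sim \frac{1}{|\mathbb{E}S_1|} \int_u^{\infty} \overline\nu(y)\,dy. \tag{2.1}$$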

Lemma 2.2 $\mathbb{P}(S_1 > x) \sim \overline\nu(x)$.

Proof. Write $S = S' + S'' + S'''$ where $S', S'', S'''$ have characteristic triplets
$(c, \sigma^2, \nu')$, $(0, 0, \nu'')$ and $(0, 0, \nu''')$, resp., with $\nu', \nu'', \nu'''$ being the restrictions of
$\nu$ to $[-1, 1]$, $(-\infty, -1)$ and $(1, \infty)$, respectively.

With $\beta''' = \overline\nu(1)$, the r.v. $S_1'''$ is a compound Poisson sum of r.v.’s, with
Poisson parameter $\beta'''$ and distribution $\nu'''/\beta'''$. Thus by X.2.1, we have

$$\mathbb{P}(S_1''' > x) \sim \beta'''\,\frac{\overline{\nu'''}(x)}{\beta'''} = \overline\nu(x), \qquad x \to \infty.$$

The independence of $S_1'''$ and $S_1'' \le 0$ therefore implies

$$\mathbb{P}(S_1'' + S_1''' > x) \sim \overline\nu(x),$$

cf. the proof of X.3.2. From (1.5) it is immediate that the Lévy exponent $\kappa'(r)$ of $S'$
satisfies $\kappa'(r) < \infty$ for all $r$. In particular, $S_1'$ is light-tailed, and the desired
estimate for $S_1 = S_1' + S_1'' + S_1'''$ then follows by X.1.11.


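Proof of Theorem 2.1. Define

$$M^d = \sup_{n = 0, 1, 2, \ldots} S_n, \qquad M = \sup_{0 \le t < \infty} S_t.$$

Then

$$\mathbb{P}(M^d > u) \sim \frac{1}{|\mathbb{E}S_1|} \int_u^{\infty} \overline\nu(y)\,dy \tag{2.2}$$

by the general random walk result X.(3.3) and Lemma 2.2. Also clearly $\mathbb{P}(M^d > u)
\le \mathbb{P}(M > u) = \psi(u)$. Given $\varepsilon > 0$, choose $a > 0$ with $\mathbb{P}(\inf_{0 \le t \le 1} S_t > -a) \ge
1 - \varepsilon$. Then $\mathbb{P}(M^d > u - a) \ge (1 - \varepsilon)\mathbb{P}(M > u) = (1 - \varepsilon)\psi(u)$. But by
subexponentiality, $\mathbb{P}(M^d > u - a) \sim \mathbb{P}(M^d > u)$. Putting these estimates together
completes the proof.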


Let us now move to general tail behavior of the Lévy measure, but restrict to
one-sided jumps only. A Lévy process $\{S_t\}$ is called spectrally negative if there
are no positive jumps, i.e. $\nu(0, \infty) = 0$. Equivalently, the paths are skipfree
upwards. In this case, $\psi(u)$ is in fact of exact exponential form:


Theorem 2.3 Assume that the claim surplus process $\{S_t\}$ is spectrally negative
with $\mathbb{E}S_1 < 0$ (i.e. ruin can only be caused by diffusion). Then $\kappa(r)$ as defined
in (1.5) has a positive zero $\gamma > 0$ and $\psi(u) = e^{-\gamma u}$, $u \ge 0$.


Proof. Spectral negativity implies $\kappa(r) < \infty$ for all $r \ge 0$, and since $\kappa(r) \to \infty$ as
$r \to \infty$ and $\kappa'(0) < 0$, the desired root $\gamma$ exists by continuity. Under the change
of measure with tilting factor $\gamma$, $\mathbb{E}_\gamma S_1 = \kappa'(\gamma) > 0$ so that $\mathbb{P}_\gamma(\tau(u) < \infty) = 1$.
Due to the absence of up-jumps we have $S_{\tau(u)} = u$ and correspondingly

$$\psi(u) = \mathbb{P}(\tau(u) < \infty) = \mathbb{E}_\gamma\bigl[e^{-\gamma S_{\tau(u)}};\ \tau(u) < \infty\bigr] = e^{-\gamma u}.$$
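
In practice the root $\gamma$ of Theorem 2.3 is usually found numerically. The following sketch does this by bisection for an assumed spectrally negative example: drift $-p$, a Brownian component with variance $\sigma^2$, and downward exponential(`nu`) jumps at rate `beta`, so that $\kappa(r) = \sigma^2 r^2/2 - pr + \beta(\nu/(\nu + r) - 1)$ for $r \ge 0$. The model, the parameter values and the function names are illustrative assumptions.

```python
import math

def kappa_sn(r, p=1.0, sigma=1.0, beta=0.5, nu=2.0):
    """Spectrally negative exponent: drift -p, Brownian part, exponential(nu) down-jumps."""
    return 0.5 * sigma ** 2 * r ** 2 - p * r + beta * (nu / (nu + r) - 1.0)

def adjustment_coefficient(kappa_fn, tol=1e-12):
    """Positive zero of a convex kappa with kappa(0) = 0, kappa'(0) < 0, kappa -> inf."""
    hi = 1.0
    while kappa_fn(hi) <= 0.0:       # move right until kappa is positive
        hi *= 2.0
    lo = hi
    while kappa_fn(lo) >= 0.0:       # move left until kappa is negative
        lo *= 0.5
    while hi - lo > tol:             # plain bisection on [lo, hi]
        mid = 0.5 * (lo + hi)
        if kappa_fn(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def ruin_probability(u, gamma):
    """psi(u) = exp(-gamma * u), as in Theorem 2.3."""
    return math.exp(-gamma * u)

gamma = adjustment_coefficient(kappa_sn)
print(gamma, ruin_probability(2.0, gamma))
```

Without jumps (`beta = 0`) the root is $2p/\sigma^2$, the familiar formula for Brownian motion with drift, which gives a simple sanity check of the routine.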


If $\{S_t\}$ is spectrally positive (i.e. $\nu(-\infty, 0) = 0$), $\psi(u)$ is not explicit but the
Laplace transform can be found in closed form:


Theorem 2.4 Assume that the claim surplus process $\{S_t\}$ is spectrally positive
with $\mathbb{E}S_1 = \mu < 0$. Then (at least) for all $s$ with $\Re(s) > 0$

$$\hat\psi[-s] = \int_0^{\infty} e^{-su}\psi(u)\,du = \frac{1}{s} + \frac{\mu}{\kappa(-s)}. \tag{2.3}$$
Proof. Define $M = \max_{t \ge 0} S_t$ and recall that the distribution of $M$ is also the
stationary distribution of the Lévy process reflected at 0,

$$V_t = V_0 + S_t + L_t, \tag{2.4}$$

where $L_t = (-\inf_{s \le t} S_s - V_0)^+$ (cf. [APQ, p. 250]). Taking $V_0 = M^*$ where $M^*$
is an independent copy of $M$, $\{V_t\}$ becomes stationary. In particular, $\mathbb{E}V_1 = \mathbb{E}V_0$
and therefore (2.4) yields $\mathbb{E}L_1 = -\mu$. Further, by spectral positivity $\{L_t\}$ is
continuous so that the Kella-Whitt martingale becomes

$$\kappa(r)\int_0^t e^{rV_u}\,du + e^{rV_0} - e^{rV_t} + rL_t.$$

Optional stopping at $t = 1$ gives

$$0 = \kappa(r)\mathbb{E}e^{rV_0} + \mathbb{E}e^{rV_0} - \mathbb{E}e^{rV_1} + r\mathbb{E}L_1 = \kappa(r)\mathbb{E}e^{rM} - r\mu.$$

Hence $\mathbb{E}e^{rM} = r\mu/\kappa(r)$, and taking $r = -s$ in
$\hat\psi[-s] = \int_0^{\infty} e^{-su}\,\mathbb{P}(M > u)\,du = (1 - \mathbb{E}e^{-sM})/s$ gives (2.3).

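Remark 2.5 Note that for the Cramér-Lundberg model $\kappa(r) = \beta(\hat B[r] - 1) - r$
and $\mu = \beta\mu_B - 1$, so that (2.3) then simplifies to

$$\hat\psi[-s] = \frac{1}{s} - \frac{1 - \beta\mu_B}{s - \beta + \beta\hat B[-s]},$$

which indeed coincides with IV.(3.4).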
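
A quick numerical check of (2.3) is possible in the Cramér-Lundberg model with exponential claims, where the ruin probability has the classical closed form $\psi(u) = (\beta/\nu)e^{-(\nu - \beta)u}$ (Poisson rate $\beta$, claim rate $\nu > \beta$, premium rate 1). The sketch compares the Laplace transform of this expression with the right-hand side of (2.3); the function names and parameter values are illustrative.

```python
def kappa_cl(r, beta, nu):
    """Cramér-Lundberg exponent beta*(B_hat[r] - 1) - r with exponential(nu) claims."""
    if r >= nu:
        raise ValueError("need r < nu")
    return beta * (nu / (nu - r) - 1.0) - r

def laplace_psi_from_theorem(s, beta, nu):
    """Right-hand side of (2.3): 1/s + mu/kappa(-s), with mu = beta/nu - 1 < 0."""
    mu = beta / nu - 1.0
    return 1.0 / s + mu / kappa_cl(-s, beta, nu)

def laplace_psi_exponential_claims(s, beta, nu):
    """Laplace transform of psi(u) = (beta/nu) * exp(-(nu - beta) * u)."""
    return (beta / nu) / (s + nu - beta)

for s in (0.1, 1.0, 5.0):
    print(s, laplace_psi_from_theorem(s, 1.0, 2.0),
          laplace_psi_exponential_claims(s, 1.0, 2.0))
```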


Next consider light tails for the up-jumps, meaning $\int_1^{\infty} e^{rx}\,\nu(dx) < \infty$ for
some $r > 0$ (which is the same as $\kappa(r) < \infty$). If $\mathbb{E}S_1 < 0$ and the adjustment
coefficient $\gamma > 0$ (i.e. the positive root of $\kappa(r) = 0$) exists, one expects from the
Cramér-Lundberg theory that $\psi(u)$ decays asymptotically exponentially at rate
$\gamma$. Indeed, in the general case:


Theorem 2.6 Consider a general Lévy process $\{S_t\}$ with $\mathbb{E}S_1 < 0$. Assume
that $\gamma > 0$ exists and satisfies $\kappa'(\gamma) < \infty$, and further that $\{S_t\}$ is not a compound
Poisson process with lattice support of $\nu$. Then $\psi(u) \sim Ce^{-\gamma u}$, $u \to \infty$,
for some constant $0 < C < \infty$.




We will see that the proof is straightforward, given some small technicalities 
on the non-lattice property. However, the really difficult step in the Cramér- 
Lundberg theory for Lévy processes is identifying C (for this one needs the 
Wiener-Hopf factorization briefly discussed in Section 4). Here we note only the 
following special case, which comprises the compound Poisson case and where 
the expression for C is entirely analogous to the one there, cf. IV.5.5: 


Corollary 2.7 If in the spectrally positive case $\gamma > 0$ exists and $\kappa'(\gamma) < \infty$,
then $C = -\mu/\kappa'(\gamma) = -\kappa'(0)/\kappa'(\gamma)$.


Proof. Because of Theorem 2.6 and Remark IV.5.6 it suffices to calculate the
constant $C = \lim_{u \to \infty} e^{\gamma u}\psi(u) = \lim_{s \to 0} s\,\hat\psi[-s + \gamma]$. In view of Theorem 2.4
the result hence follows by a simple application of L’Hôpital’s rule.


Proof of Theorem 2.6. The spectrally negative case is covered by Theorem 2.3.
In the (additional) presence of positive jumps, let $\xi(x) = S_{\tau(x)} - x$ denote the
overshoot and

$$Y = Y_1 = \inf\{x > 1 : \xi(x-) = 0\}, \qquad Y_n = \inf\{x > 1 + Y_{n-1} : \xi(x-) = 0\}.$$

Then the $Y_n$ are finite $\mathbb{P}_\gamma$-a.s. since then $\mathbb{E}_\gamma S_1 > 0$ and hence $\tau(x) < \infty$ for all $x$,
and $\mathbb{E}_\gamma S_1 > 0$ implies that there exists an infinity of $x$ with $\xi(x) = 0$ [note that
we cannot use $x > 0$ in the definition instead of $x > 1$ since it may then happen
that $Y_1 = 0$ a.s.]. Thus $\{\xi(x)\}_{x \ge 0}$ is a regenerative process with regeneration
points $Y_1, Y_2, \ldots$. The assumption that $S$ is not a compound Poisson process
with lattice support of $\nu$ is easily seen to imply that the distribution of $Y_1$ is
non-lattice (see Kyprianou [564] for details). Hence $\xi(x) \xrightarrow{D} \xi(\infty)$ for some
$\xi(\infty) < \infty$, and using exponential change of measure, we get

$$\psi(u) = \mathbb{E}_\gamma\bigl[e^{-\gamma S_{\tau(u)}};\ \tau(u) < \infty\bigr] = \mathbb{E}_\gamma e^{-\gamma S_{\tau(u)}} = e^{-\gamma u}\,\mathbb{E}_\gamma e^{-\gamma\xi(u)} \sim Ce^{-\gamma u},$$

where $C = \mathbb{E}_\gamma e^{-\gamma\xi(\infty)}$.


Notes and references Asymptotic results on ruin probabilities for Lévy insurance
risk processes can be found in Klüppelberg, Kyprianou & Maller [543]. For Theorem
2.1, see also Klüppelberg & Kyprianou [542]. Corollary 2.7 goes back to Doney [825],
where it is given as a consequence of a more involved argument. Huzak et al. [490]
derive a ladder-height decomposition of the ruin probability and in that way generalize
the Pollaczek-Khinchine formula to certain Lévy set-ups, see also Schmidli [776]. At
the same time, this implies that one can formulate a defective renewal equation for the
ruin probability, see also Section XII.4. Bernyk, Dalang & Peskir [156] use fractional
derivatives to derive accurate information on finite-time ruin probabilities for $\alpha$-stable
Lévy processes ($1 < \alpha < 2$) with only down-sided jumps. For extensions to more
general Lévy processes with two-sided jumps, see for instance Bertoin & Doney [158]
and Lewis & Mordecki [582].

For asymptotic results on finite-time ruin probabilities, see Palmowski & Pistorius
[679].


3 The scale function and two-sided ruin problems


The concept of a scale function as discussed in II.2 for diffusions and giving
two-sided exit probabilities generalizes to Lévy processes. One can even go one
step further and include information on the exit time as well. So, let $\{X_t\}_{t \ge 0}$
be a Lévy process with Lévy exponent $\kappa(r)$, and for $0 < u < a$, define

$$\tau_0^- = \inf\{t \ge 0 : X_t < 0 \mid X_0 = u\}, \qquad \tau_a^+ = \inf\{t \ge 0 : X_t > a \mid X_0 = u\}.$$

To avoid trivialities, we assume that $\{X_t\}$ is not a subordinator or the negative
of a subordinator. We will also need the assumption of spectral negativity (this
ensures $\kappa(s) < \infty$ for all $s \ge 0$); so in this section $\{X_t\}$ refers to the reserve
process $\{R_t\}$. Of course, results for the spectrally positive case (and thus for
the claim surplus process $\{S_t\}$) follow immediately by sign reversion.

For $\delta > 0$, the equation $\kappa(s) = \delta$ has a unique positive solution which we
denote by $\rho_\delta > 0$. If $\delta = 0$ and $\mathbb{E}(X_1) = \kappa'(0) \ge 0$, then $\rho_\delta = \rho_0 = 0$. Note
that since we now deal with $R_t$ (rather than $S_t$), the sign of the argument of $\kappa$
is reversed,³ so the positive solution is now not the adjustment coefficient!


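Lemma 3.1 $\mathbb{E}_u\bigl[e^{-\delta\tau_a^+};\ \tau_a^+ < \infty\bigr] = e^{-\rho_\delta(a - u)}$.


Proof. As a simple adaptation of Lemma X.3.1, just note that $\{e^{\rho_\delta X_t - \delta t}\}$ is a
martingale, apply optional stopping at $\tau_a^+ \wedge T$ and let $T \to \infty$ with dominated
convergence.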


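Theorem 3.2 (a) For each $\delta \ge 0$, there exists a function $W^{(\delta)}(u)$ (the scale
function) such that

$$\mathbb{E}_u\bigl[e^{-\delta\tau_a^+};\ \tau_a^+ < \tau_0^-\bigr] = \frac{W^{(\delta)}(u)}{W^{(\delta)}(a)}. \tag{3.1}$$

(b) $W^{(\delta)}(u)$ is unique up to a multiplicative constant, which may be chosen such
that $W^{(\delta)}(u)$ is given via its Laplace transform in $u$ by

$$\int_0^{\infty} e^{-su}\,W^{(\delta)}(u)\,du = \frac{1}{\kappa(s) - \delta} \qquad \text{for } s > \rho_\delta. \tag{3.2}$$

³which we emphasize by using the argument $s$ instead of $r$.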
Note that taking $\delta = 0$ we obtain the probability of exiting the interval $(0, a)$ to
the right as $W^{(0)}(u)/W^{(0)}(a)$. Note also that the one-sided survival probability
can be computed by taking the limit $a \to \infty$. In particular, if $\kappa'(0) > 0$, then
$\lim_{a \to \infty} W^{(0)}(a) = \lim_{s \to 0} s/\kappa(s) = 1/\kappa'(0)$, so that

$$\phi(u) = 1 - \psi(u) = \kappa'(0)\,W^{(0)}(u)$$

(if $\kappa'(0) \le 0$, $\psi(u) = 1$).
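
For a concrete feeling for (3.2), consider as an assumed example the reserve $X_t = u + \mu t + \sigma B_t$ with $\mu > 0$, where $\kappa(s) = \mu s + \sigma^2 s^2/2$. A partial fraction expansion of $1/\kappa(s)$ gives $W^{(0)}(u) = (1 - e^{-2\mu u/\sigma^2})/\mu$, and hence $\phi(u) = 1 - e^{-2\mu u/\sigma^2}$. The sketch below evaluates these expressions and checks (3.2) with a simple midpoint rule; function names, parameter values and the quadrature settings are illustrative choices.

```python
import math

def scale_function_bm(u, mu, sigma):
    """W^(0)(u) for X_t = u + mu*t + sigma*B_t, mu > 0, normalised as in (3.2)."""
    return (1.0 - math.exp(-2.0 * mu * u / sigma ** 2)) / mu

def exit_above_probability(u, a, mu, sigma):
    """P_u(tau_a^+ < tau_0^-) = W^(0)(u) / W^(0)(a)."""
    return scale_function_bm(u, mu, sigma) / scale_function_bm(a, mu, sigma)

def survival_probability(u, mu, sigma):
    """phi(u) = kappa'(0) * W^(0)(u), where kappa'(0) = mu."""
    return mu * scale_function_bm(u, mu, sigma)

def laplace_transform(f, s, upper=60.0, steps=60_000):
    """Midpoint-rule approximation of the integral of exp(-s*u) * f(u) over [0, upper]."""
    h = upper / steps
    return h * sum(math.exp(-s * (i + 0.5) * h) * f((i + 0.5) * h) for i in range(steps))

mu, sigma, s = 0.5, 1.0, 1.0
lhs = laplace_transform(lambda u: scale_function_bm(u, mu, sigma), s)
rhs = 1.0 / (mu * s + 0.5 * sigma ** 2 * s ** 2)      # 1 / kappa(s)
print(lhs, rhs, survival_probability(1.0, mu, sigma))
```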
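A further fundamental function for two-sided ruin problems is

$$Z^{(\delta)}(u) = 1 + \delta\int_0^u W^{(\delta)}(y)\,dy. \tag{3.3}$$

In fact:


Theorem 3.3 (a)

$$\mathbb{E}_u\bigl[e^{-\delta\tau_0^-};\ \tau_0^- < \infty\bigr] = Z^{(\delta)}(u) - \frac{\delta}{\rho_\delta}\,W^{(\delta)}(u). \tag{3.4}$$

(b)

$$\mathbb{E}_u\bigl[e^{-\delta\tau_0^-};\ \tau_0^- < \tau_a^+\bigr] = Z^{(\delta)}(u) - Z^{(\delta)}(a)\,\frac{W^{(\delta)}(u)}{W^{(\delta)}(a)}. \tag{3.5}$$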

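In the proofs, we will need the running minimum and maximum,

$$\underline X_t = \inf_{0 \le s \le t} X_s, \qquad \overline X_t = \sup_{0 \le s \le t} X_s.$$

Further $e_\delta$ will denote an exponential r.v. with rate $\delta$ which is independent
of the Lévy process $\{X_t\}$. $e_\delta$ will become useful via the following lemma:


Lemma 3.4

$$\mathbb{E}e^{-s\overline X_{e_\delta}} = \frac{\rho_\delta}{\rho_\delta + s}, \qquad \mathbb{E}e^{s\underline X_{e_\delta}} = \frac{\delta}{\rho_\delta} \cdot \frac{\rho_\delta - s}{\delta - \kappa(s)}.$$


Proof. Using exponential change of measure, we get

$$\begin{aligned}
\mathbb{P}(\overline X_{e_\delta} > a) &= \mathbb{P}(\tau_a^+ < e_\delta) \\
&= \mathbb{E}\bigl[e^{-\delta\tau_a^+};\ \tau_a^+ < \infty\bigr] \\
&= \mathbb{E}_{\rho_\delta}\bigl[e^{-\delta\tau_a^+ - \rho_\delta X_{\tau_a^+} + \tau_a^+\kappa(\rho_\delta)};\ \tau_a^+ < \infty\bigr] = e^{-\rho_\delta a}.
\end{aligned}$$

I.e., $\overline X_{e_\delta}$ is exponentially distributed with parameter $\rho_\delta$ which is equivalent to
the first statement of the lemma.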
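For the second, we use the Kella-Whitt martingale $M_t$ (say) with exponential
parameter $-s$ on $Z_t = -X_t + L_t$ with $L_t = -\inf_{0 \le v \le t}(-X_v) = \sup_{0 \le v \le t} X_v = \overline X_t$. I.e.,
$Z$ is $-X$ reflected at 0 which by the continuous-time analogue of II.(3.2) implies
that

$$Z_t \stackrel{D}{=} \sup_{0 \le v \le t} (-X_v) = -\underline X_t.$$

Note that spectral negativity implies that $L$ can only increase when $Z$ is at 0
and that $L$ has no jumps. Therefore

$$M_t = \kappa(s)\int_0^t e^{-sZ_v}\,dv + 1 - e^{-sZ_t} - s\int_0^t L(dv).$$

Since optional stopping at an independent random time is permissible for any
martingale, we have

$$0 = M_0 = \mathbb{E}M_{e_\delta}. \tag{3.6}$$

Here

$$\mathbb{E}\int_0^{e_\delta} e^{-sZ_v}\,dv = \int_0^{\infty} e^{-\delta v}\,\mathbb{E}e^{-sZ_v}\,dv = \frac{1}{\delta}\,\mathbb{E}e^{s\underline X_{e_\delta}}.$$

Using the just established fact that $\overline X_{e_\delta}$ has an exponential($\rho_\delta$) distribution,
so that $\mathbb{E}L_{e_\delta} = 1/\rho_\delta$, (3.6) therefore becomes

$$0 = \frac{\kappa(s)}{\delta}\,\mathbb{E}e^{s\underline X_{e_\delta}} + 1 - \mathbb{E}e^{s\underline X_{e_\delta}} - \frac{s}{\rho_\delta},$$

which gives the desired conclusion concerning $\underline X_{e_\delta}$.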


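Proof of Theorem 3.2(a) when $\kappa'(0) > 0$, $\delta = 0$.

The assumption $\kappa'(0) > 0$ ensures that $-\underline X_\infty$ is finite a.s., and we can define
$W^{(0)}(u) = \mathbb{P}_u(\underline X_\infty \ge 0)$. Sample path arguments beyond the scope of this book
show that either at $\tau_0^-$ or immediately after, $X$ will attain strictly negative values
(one needs to consider the case of a Brownian component or none separately;
see [564, pp. 216, 177-179]). Therefore, under the event $\tau_0^- < \tau_a^+$, $\underline X_\infty \ge 0$ is
impossible so that by the strong Markov property

$$\mathbb{P}_u(\underline X_\infty \ge 0) = \mathbb{E}_u\bigl[\mathbb{P}_a(\underline X_\infty \ge 0);\ \tau_a^+ < \tau_0^-\bigr] = \mathbb{P}_a(\underline X_\infty \ge 0)\,\mathbb{P}_u(\tau_a^+ < \tau_0^-),$$

which gives the desired conclusion in the form

$$\mathbb{P}_u(\tau_a^+ < \tau_0^-) = \frac{W^{(0)}(u)}{W^{(0)}(a)}. \tag{3.7}$$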
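Proof of Theorem 3.2(a) when $\delta > 0$ or $\kappa'(0) < 0$, $\delta = 0$.
In this case $\rho_\delta > 0$ and we can use exponential tilting with factor $\rho_\delta$ and define

$$W_{\rho_\delta}(u) = \mathbb{P}_{u,\rho_\delta}(\underline X_\infty \ge 0), \qquad W^{(\delta)}(u) = e^{\rho_\delta u}\,W_{\rho_\delta}(u). \tag{3.8}$$

Easy convexity arguments show that the $\mathbb{P}_{\rho_\delta}$-drift is positive, so by (3.7),

$$\mathbb{P}_{u,\rho_\delta}(\tau_a^+ < \tau_0^-) = \frac{W_{\rho_\delta}(u)}{W_{\rho_\delta}(a)}. \tag{3.9}$$

But using the exponential change of measure, the l.h.s. can also be written as

$$\mathbb{E}_u\bigl[\exp\{\rho_\delta(X_{\tau_a^+} - u) - \delta\tau_a^+\};\ \tau_a^+ < \tau_0^-\bigr] = e^{\rho_\delta(a - u)}\,\mathbb{E}_u\bigl[e^{-\delta\tau_a^+};\ \tau_a^+ < \tau_0^-\bigr].$$

Combining these two expressions gives (3.1).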


Proof of Theorem 3.2(a) when $\delta = 0$ and $\kappa'(0) = 0$.
An easy continuity argument, letting $\delta \downarrow 0$. We omit the details.


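Proof of Theorem 3.2(b).

Consider again first the case $\kappa'(0) > 0$. It is obvious that $W^{(0)}$ may be modified
by a multiplicative constant, so we redefine $W^{(0)}(u)$ as $W^{(0)}(u) = \mathbb{P}_u(\underline X_\infty \ge
0)/\kappa'(0)$.

Using integration by parts and noting that $\mathbb{P}(\underline X_\infty = 0) = 0$, we have

$$\begin{aligned}
\mathbb{E}e^{s\underline X_\infty} = \mathbb{E}e^{-s(-\underline X_\infty)} &= 1 - \int_0^{\infty} s e^{-su}\,\mathbb{P}(-\underline X_\infty > u)\,du \\
&= \int_0^{\infty} s e^{-su}\,\mathbb{P}(-\underline X_\infty \le u)\,du \\
&= \int_0^{\infty} s e^{-su}\,\mathbb{P}_u(\underline X_\infty \ge 0)\,du.
\end{aligned}$$

On the other hand, $\rho_\delta \to 0$ as $\delta \downarrow 0$, more precisely $\rho_\delta \sim \delta/\kappa'(0)$. Letting
$\delta \downarrow 0$ in the second part of Lemma 3.4 therefore yields $\mathbb{E}e^{s\underline X_\infty} = \kappa'(0)s/\kappa(s)$.
Comparing these two expressions gives

$$\int_0^{\infty} e^{-su}\,W^{(0)}(u)\,du = \frac{1}{\kappa(s)},$$

which is the desired conclusion for the case $\kappa'(0) > 0$. The proofs for the remaining
cases are then easy by involving the connections between $W^{(0)}$, $W_{\rho_\delta}$, and
$W^{(\delta)}$.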
Remark 3.5 There is a simple, but slightly heuristic way to see Theorem 3.2 for
arbitrary drift and $\delta \ge 0$: Define $C(u, a) = \mathbb{E}_u\bigl[e^{-\delta\tau_a^+};\ \tau_a^+ < \tau_0^-\bigr]$. By the absence
of upward jumps, $X$ with $X_0 = u$ can only reach an arbitrary level $b$ (with $b >
a > u$) without ruin in between, if level $a$ is passed before that. Consequently,
by the strong Markov property of $X$ one has $C(u, b) = C(u, a)\,C(a, b)$, so that
$C(u, a) = C(u, b)/C(a, b) = h(u)/h(a)$ and one may identify $h$ with the scale
function. This argument shows that Theorem 3.2 is in fact valid beyond Lévy
processes, namely for stationary Markov processes without upward jumps.


We next turn to the proof of Theorem 3.3. The first step is to note:

Lemma 3.6 There exists a measure $W^{(\delta)}(du)$ on $[0, \infty)$ such that $W^{(\delta)}[0, u] =
W^{(\delta)}(u)$. This measure has Laplace transform

$$\int_0^{\infty} e^{-su}\,W^{(\delta)}(du) = \frac{s}{\kappa(s) - \delta}. \tag{3.10}$$


Proof. The first statement is clear since $W^{(\delta)}(u)$ is strictly increasing. The
l.h.s. of (3.10) then comes out as

$$\int_0^{\infty} W^{(\delta)}(du) \int_u^{\infty} s e^{-sy}\,dy = \int_0^{\infty} s e^{-sy}\,dy \int_0^{y} W^{(\delta)}(du) = s\int_0^{\infty} e^{-sy}\,W^{(\delta)}(y)\,dy,$$

which is the same as the r.h.s.


Proof of Theorem 3.3(a). 
For 6 > 0, we have from Lemma 3.4 that 


| -> ô S ô X ro Xes 
~ pa kls)=8 K(s)-5 l 


n e [two (du) — SW ® (u) du 
0 Ps 


Le. 
P(-X,, € du) = Zw (au) — W ®) (u) du. 
ô 


It follows that 


biG, <o] = Paules > rg) By (XG, <0) 


= 1-P-X,, <u) = 148 f Way - WOW) 
0 pô 
= Z)(u) = 2 WE), 
Pô 


4Strictly speaking, we would need a right-continuous version. However, this turns out to 
be inessential for the following, and in fact $W^{(\delta)}(u)$ can be shown to be continuous.

The case $\delta = 0$ is again easy by taking limits.
Proof of Theorem 3.3(b). 
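We can write

\[
E_u\bigl[e^{-\delta\tau_0^-};\ \tau_0^- < \tau_a^+\bigr]
 = E_u\bigl[e^{-\delta\tau_0^-};\ \tau_0^- < \infty\bigr]
 - E_u\bigl[e^{-\delta\tau_0^-};\ \tau_a^+ < \tau_0^- < \infty\bigr].
\]

Since $X_{\tau_a^+} = a$, the second $E_u$ is

\[
E_u\bigl[e^{-\delta\tau_a^+};\ \tau_a^+ < \tau_0^-\bigr]\cdot E_a\bigl[e^{-\delta\tau_0^-};\ \tau_0^- < \infty\bigr].
\]

Thus $E_u[e^{-\delta\tau_0^-};\ \tau_0^- < \tau_a^+]$ becomes

\[
Z^{(\delta)}(u) - \frac{\delta}{\rho_\delta}W^{(\delta)}(u)
 - \frac{W^{(\delta)}(u)}{W^{(\delta)}(a)}\Bigl(Z^{(\delta)}(a) - \frac{\delta}{\rho_\delta}W^{(\delta)}(a)\Bigr)
 = Z^{(\delta)}(u) - Z^{(\delta)}(a)\,\frac{W^{(\delta)}(u)}{W^{(\delta)}(a)},
\]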


which is the asserted expression. 


Notes and references Scale functions are a classical tool for spectrally one-sided
Lévy processes with roots in Zolotarev [922], Takács [827] and Korolyuk [554]. Parts
of the exposition above are close to Kyprianou [564]. For a recent survey of available
explicit forms of scale functions and methods for constructing them, see Hubalek &
Kyprianou [482] and also Kyprianou & Rivero [567]. The argument of Remark 3.5 can
be found in Gerber, Lin & Yang [407].

As for ruin probabilities themselves, one can naturally also use the generator ap- 
proach to identify the scale function, leading to an integro-differential equation and 
subsequently to a Volterra integral equation. The connection between this approach 
and the more standard one pursued here is highlighted in Biffis & Kyprianou [165]. 

For an extension of the above analysis to one- and two-sided exit problems with 
non-constant boundaries see Bertoin, Doney & Maller [159]. Extensions to Lévy pro- 
cesses that are reflected at the supremum or infimum are worked out by Zhou [920]. 
Loeffen & Patie [605] give a fine analysis of one- and two-sided exit problems with 
interest rates and absolute ruin for the case when the aggregate claim process is a 
subordinator. 

Lemma 3.4 can be exploited to design efficient numerical procedures for determining
finite-time ruin probabilities $\psi(u,t)$; for an application in credit risk see e.g. Madan
& Schoutens [622].


4 Further topics 


This section gives a brief overview of some topics which are basic in fluctuation
theory for Lévy processes but more advanced than what we have looked at so
far. The treatment should basically be seen as a heuristic introduction (to be
followed up by the interested reader by more detailed and rigorous treatments
such as Kyprianou [564]). Thus the 'proofs' we present should mainly be considered
as heuristic motivations that the results are true (in fact, the theory
is so advanced that even [564] has to skip certain steps).

The topics under consideration are certainly relevant for ruin theory. How- 
ever, one problem is that explicit results beyond what we have already presented 
are rarely available. 


4a Local time at the maximum 


In the following, denote by $\overline{X}_t = \sup_{0\le s\le t} X_s$ the running maximum. A non-decreasing
process $\{L_t\}$ with D-paths and $L_0 = 0$ is called a (version of) the
local time at the maximum if

(i) The support of the measure $dL_t$ is the closure of the set $\{t:\ \overline{X}_t = X_t\}$;
(ii) For every stopping time $\tau$ such that $\overline{X}_\tau = X_\tau$ on $\{\tau < \infty\}$, the shifted
trivariate process

\[
\bigl\{X_{\tau+t} - X_\tau,\ \overline{X}_{\tau+t} - X_{\tau+t},\ L_{\tau+t} - L_\tau\bigr\}_{t\ge 0}
\]

is independent of $\mathcal{F}_\tau$ on $\{\tau < \infty\}$ and has the same distribution as

\[
\bigl\{X_t,\ \overline{X}_t - X_t,\ L_t\bigr\}_{t\ge 0}.
\]

Note that this definition identifies L only up to a multiplicative constant, and 

that existence is not a priori obvious. Note also that the term ‘local time’ occurs 

in various different, though often related, meanings in the probability literature. 
For some Lévy processes an obvious candidate for L easily suggests itself and 

the verification that it indeed is a local time is straightforward. In particular: 

(a) If $X$ is spectrally negative, one can take $L_t = \overline{X}_t$.

(b) For a compound Poisson process with positive drift and negative jumps, the
set $\{t:\ \overline{X}_t = X_t\}$ is a union of disjoint intervals, and one may take

\[
L_t = a\int_0^t I(\overline{X}_s = X_s)\,ds
\]

with $a > 0$ arbitrary. In particular, this covers the reserve process in the Cramér-Lundberg
model.

(c) If the set of times of maxima of $X$ is discrete (as for the claim surplus process
of the Cramér-Lundberg model), one may take $L_t = M_t$ where $M_t$ is the number
of maxima before $t$.
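Candidate (b) can be made concrete on a single deterministic path. The sketch below assumes a path that starts at 0, climbs linearly at rate `drift` and has downward jumps at prescribed times; the function name and the input format are hypothetical. It returns both the local time from (b) and the running maximum, so that the proportionality with candidate (a) is visible.

```python
def local_time_at_max(drift, jumps, t_end, a=1.0):
    """Return (L, running_max) at time t_end for a piecewise-linear path.

    The path starts at 0, increases at rate drift > 0, and jumps down by `size`
    at each time in `jumps`, a time-sorted list of (time, size) pairs.
    L = a * Lebesgue measure of {s <= t_end : X_s equals its running maximum}.
    """
    x = 0.0             # current value of the path
    running_max = 0.0   # running maximum so far
    time_at_max = 0.0   # accumulated time spent at the maximum
    now = 0.0
    events = [(t, size) for (t, size) in jumps if t <= t_end] + [(t_end, 0.0)]
    for t, size in events:
        dt = t - now
        # The path is at its maximum only after recovering the gap to the old maximum.
        recovery_time = (running_max - x) / drift
        time_at_max += max(0.0, dt - recovery_time)
        x += drift * dt
        running_max = max(running_max, x)
        x -= size       # downward jump (size 0 at the final horizon)
        now = t
    return a * time_at_max, running_max
```

With `drift = 1`, jumps `[(1.0, 0.5), (3.0, 2.0)]` and horizon 4, the path is at its maximum on $[0,1]$ and $[1.5,3]$, so the time spent at the maximum is 2.5, which equals the running maximum divided by the drift.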


The intuition behind the definition of local time at the maximum is to give
an indication of how much time $X$ has spent at its running maximum before
$t$. For this reason, none of the definitions in (a), (b), (c) is applicable in the
whole class of Lévy processes. More precisely, the definition $L_t = \overline{X}_t$ would
not have the required intuitive properties for (say) a compound Poisson process
without drift, and the definitions in (b), (c) would not be appropriate for (say)
Brownian motion because $\{t:\ \overline{X}_t = X_t\}$ is a Lebesgue null set, excluding (b), and
not discrete, excluding (c).

The general definition of $L$ requires the notion of regularity. We say that
$B$ (say $B = (0,\infty)$ or $B = [0,\infty)$) is regular if $P(\tau_B = 0) = 1$ where $\tau_B =
\inf\{t > 0:\ X_t \in B\}$ (note that by Blumenthal's 0-1 law, $P(\tau_B = 0)$ is either
0 or 1). For example, $(0,\infty)$ is regular for Brownian motion, but $(0,\infty)$ and
$[0,\infty)$ are both irregular for the Cramér-Lundberg claim surplus process. There
are then the following three cases:
1) $X$ has bounded variation and $[0,\infty)$ is irregular. Then the set of maxima is
discrete and we can take $L_t = \sum_{i \le M_t} E_i$, where $M_t$ is as in (c) and the $E_i$ are
i.i.d. exponential($\lambda$).$^5$
2) $X$ has bounded variation and $(-\infty,0)$ is irregular. One may define $L$ as in
(b) above. If $X$ is spectrally negative, then $L$ is proportional to $\overline{X}$.
3) $X$ has unbounded variation (this can be shown to imply that $[0,\infty)$ is regular).
A local time exists, but there is no simple known expression in terms of the path
of $X$. Again, if $X$ is spectrally negative, then $L$ is proportional to $\overline{X}$.


4b The ladder height process 


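First note that $L_\infty = \lim_{t\to\infty} L_t$ may be finite (the main case is negative drift).
Then define

\[
L_t^{-1} = \begin{cases} \inf\{s > 0:\ L_s > t\} & t < L_\infty \\ \infty & t \ge L_\infty \end{cases}
\]

Further let the ladder height process be

\[
M_t = \begin{cases} X_{L_t^{-1}} & t < L_\infty \\ \infty & t \ge L_\infty \end{cases}
\]

Theorem 4.1 The process $Y = \{(L_t^{-1}, M_t)\}_{t\ge 0}$ is a bivariate Lévy process,
possibly terminating if $L_\infty < \infty$.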


Proof. The definition of $(L_t^{-1}, M_t)$ immediately implies that $X$ is at a maximum
at time $L_t^{-1}$. Therefore

\[
Y_{t+s} - Y_t = \bigl(L_{t+s}^{-1} - L_t^{-1},\ M_{t+s} - M_t\bigr)
\]


5The variation from (c) is motivated from the desire of Le to have certain properties, cf. 
the following Section 4b. 
has the same distribution as $Y_s = (L_s^{-1}, M_s)$ and is independent of $\{Y_u\}_{u\le t}$.
This implies that $Y$ has stationary independent increments and the assertion.

It follows that for some suitable function $\phi(\cdot,\cdot)$ we can write

\[
\log E\exp\{-aL_t^{-1} - bM_t\} = -\phi(a,b)\,t. \tag{4.1}
\]


4c Excursions 


By an excursion from the maximum we understand a segment $\{X_t\}_{u\le t\le v}$ of $X$
such that $\overline{X}_u = X_u \le X_v$ and $X_t < X_u$ for $u < t < v$.

The fundamental fact about excursions is that they roughly occur according
to a Poisson process in the time scale given by $L_t$. However, for example
for $X$ a Brownian motion and $s$ a time where $X$ is at a maximum, sample
path properties of $X$ imply that each interval $[s, s+\epsilon]$ contains infinitely many
excursions. Of course, the sum of their lengths has to be finite, so for each $\delta > 0$
there must be finitely many excursions of length $> \delta$ and infinitely many of
length $\le \delta$. The same phenomenon typically occurs for general Lévy processes,
so a careful formulation of the Poisson property is needed, for example the
following one:

Theorem 4.2 Let $\delta > 0$ and let $\eta_1 < \eta_2 < \ldots$ be the times $> 0$ where an excursion
of length $> \delta$ starts. Then the points $L_{\eta_1}, L_{\eta_2}, \ldots$ form a homogeneous
Poisson process.


Proof. We only treat the case of $L_t$ being continuous in $t$. For brevity, denote
the excursions of length $> \delta$ as $\delta$-excursions. The counting process $N$ of $\delta$-excursions
on the $L^{-1}$-scale is given by $N_t = \max\{i:\ \eta_i \le s\}$ where $L_s = t$.
Let $t_1 < t_2 < t_3 < \cdots$. Then in the ordinary time scale for $X$, $t_1, t_2, \ldots$
correspond to times $s_1 < s_2 < s_3 < \cdots$ with $L_{s_i} = t_i$ and $X$ is at a maximum at each
$s_i$. It is clear that the number $N_{t_2} - N_{t_1}$ of $\eta_i \in [s_1, s_2]$ is independent of the
number $N_{t_3} - N_{t_2}$ of $\eta_i \in [s_2, s_3]$, and similarly for further intervals of the same
type. Further, it is not difficult to see by considering a sample path that the
distribution of $N_{t_2} - N_{t_1}$ only depends on $t_2 - t_1$. Also, if a $\delta$-excursion starts
and ends at say $u, v$, then $X$ is at a maximum at $v$ so the local time has to
increase at $v$ which implies that the local time at the time $w$ (say) where the
next $\delta$-excursion starts satisfies $L_w > L_v = L_u$. This implies that $N$ cannot
have multiple points, which together with the already noted fact of stationary
and independent increments implies the Poisson property.


$^6$The characteristic 'length $> \delta$' could be replaced by many others, for example that the
maximal deviation from the maximum during the excursion exceeds $\delta$.

The intensity parameter of $N$ or similar point processes of excursions is in
general not available. A notable exception is Itô's excursion law for Brownian
motion (e.g. Rogers & Williams [744, Sec. VI.8]), where a complete description of
the probability mechanism governing Brownian excursions is possible. Another
one is the reserve process $R_t$ of the Cramér-Lundberg model, where excursions
occur at intensity $\beta$ and have the distribution of the busy period in the dual
M/G/1 queue (see also Theorem III.2.3).


4d The Wiener-Hopf factorization 


The Wiener-Hopf factorization occurs in many alternative forms in the literature,
but its currently most used version is the following one. Recall that $e_\delta$ is
an independent exponential($\delta$) time. Further, define

\[
\overline{G}_t = \sup\{s < t:\ X_s = \overline{X}_s\}, \qquad \underline{G}_t = \sup\{s < t:\ X_s = \underline{X}_s\}
\]

(the times of the last maximum, resp. minimum, before $t$).


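Theorem 4.3 (i) The pairs $(\overline{G}_{e_\delta}, \overline{X}_{e_\delta})$ and $(e_\delta - \overline{G}_{e_\delta},\ X_{e_\delta} - \overline{X}_{e_\delta})$ are independent.
Therefore

\[
\frac{\delta}{\delta - a - \kappa(b)} = \Psi^+(a,b)\,\Psi^-(a,b) \tag{4.2}
\]

where

\[
\Psi^+(a,b) = E e^{a\overline{G}_{e_\delta} + b\overline{X}_{e_\delta}}, \qquad
\Psi^-(a,b) = E e^{a\underline{G}_{e_\delta} + b\underline{X}_{e_\delta}}. \tag{4.3}
\]

(ii) The functions $\Psi^+, \Psi^-$ in (4.3) can be identified (involving analytic continuation,
if needed) via the function $\phi$ in (4.1) and the corresponding one $\hat\phi$ for the
descending ladder process by means of

\[
E e^{a\overline{G}_{e_\delta} + b\overline{X}_{e_\delta}} = \frac{\phi(\delta,0)}{\phi(\delta - a, -b)}, \qquad
E e^{a\underline{G}_{e_\delta} + b\underline{X}_{e_\delta}} = \frac{\hat\phi(\delta,0)}{\hat\phi(\delta - a, -b)}. \tag{4.4}
\]

The functions $\Psi^+, \Psi^-$ are called the Wiener-Hopf factors of $X$. They are obviously
at best given up to a multiplicative constant, but can in fact be shown to
be unique modulo this.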


Proof. Given $e_\delta > t$, the distribution of $e_\delta - t$ is again exponential($\delta$). If $t$ is
the time of the last maximum before $e_\delta$, this changes the distribution to that
of an exponential $e_\delta^*$ (which is an independent copy of $e_\delta$) given that an excursion
away from the maximum occurs at time 0 and lasts at least $e_\delta^*$. However,
independence pertains and obviously, $X_{e_\delta} - \overline{X}_{e_\delta}$ is conditionally independent of
$(\overline{G}_t, \overline{X}_t)$ and independent of $t$, which implies the claimed independence.
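For (4.2), first note that

\[
\frac{\delta}{\delta - a - \kappa(b)} = \int_0^\infty \delta e^{-\delta t}\,e^{at}\,e^{\kappa(b)t}\,dt = E e^{a e_\delta + bX_{e_\delta}}.
\]

Using the independence just established, this becomes

\[
E e^{a\overline{G}_{e_\delta} + b\overline{X}_{e_\delta}}\;E e^{a(e_\delta - \overline{G}_{e_\delta}) + b(X_{e_\delta} - \overline{X}_{e_\delta})}
 = \Psi^+(a,b)\,E e^{a(e_\delta - \overline{G}_{e_\delta}) + b(X_{e_\delta} - \overline{X}_{e_\delta})}. \tag{4.5}
\]

However, a sign reversion argument easily gives that

\[
\bigl(e_\delta - \overline{G}_{e_\delta},\ X_{e_\delta} - \overline{X}_{e_\delta}\bigr) \stackrel{\mathcal{D}}{=} \bigl(\underline{G}_{e_\delta},\ \underline{X}_{e_\delta}\bigr).
\]

Hence the final expectation in (4.5) reduces to $\Psi^-(a,b)$, completing the proof
of (i).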


We do not give the proof of (ii), see for instance Kyprianou [564]. 


Example 4.4 For the spectrally negative case with positive drift, $L_t^{-1} = \tau_t^+$
and $M_t = t$, so that from (4.1) and Lemma 3.1 it follows that $\phi(a,b) = \rho_a + b$
($\tau_t^+ < \infty$ a.s. for positive drift). Hence

\[
\Psi^+(a,b) = \rho_\delta/(\rho_{\delta-a} - b),
\]

which for the special case $a = 0$ was already established in Lemma 3.4. From
(4.2) we then easily identify

\[
\Psi^-(a,b) = \frac{\delta}{\delta - a - \kappa(b)}\cdot\frac{\rho_{\delta-a} - b}{\rho_\delta}.
\]

From this one can, in view of (4.3), read off $\hat\phi(a,b) = (a - \kappa(b))/(\rho_a - b)$.
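The factors of Example 4.4 can be checked numerically in a case where the laws of the extrema are known. The sketch below assumes Brownian motion with drift, $\kappa(s) = \mu s + \sigma^2 s^2/2$ with $\mu > 0$, and uses the classical fact that $-\underline{X}_{e_\delta}$ is then exponential with rate $(\mu + \sqrt{\mu^2 + 2\delta\sigma^2})/\sigma^2$. All function names are illustrative.

```python
import math

def kappa(s, mu, sigma):
    """Levy exponent of Brownian motion with drift."""
    return mu * s + 0.5 * sigma ** 2 * s ** 2

def rho(q, mu, sigma):
    """Positive root of kappa(s) = q, for q > 0."""
    return (-mu + math.sqrt(mu ** 2 + 2.0 * q * sigma ** 2)) / sigma ** 2

def psi_plus(a, b, delta, mu, sigma):
    """Psi^+(a, b) = rho_delta / (rho_{delta - a} - b), valid for a < delta."""
    return rho(delta, mu, sigma) / (rho(delta - a, mu, sigma) - b)

def psi_minus(a, b, delta, mu, sigma):
    """Psi^-(a, b) as identified from the factorization (4.2)."""
    r = rho(delta, mu, sigma)
    return (delta * (rho(delta - a, mu, sigma) - b)
            / (r * (delta - a - kappa(b, mu, sigma))))

def min_transform_bm(b, delta, mu, sigma):
    """E exp(b * min of X up to e_delta) for Brownian motion with drift, b >= 0."""
    rate = (mu + math.sqrt(mu ** 2 + 2.0 * delta * sigma ** 2)) / sigma ** 2
    return rate / (rate + b)
```

For `mu = sigma = 1` and `delta = 0.5`, `psi_minus(0.0, 0.8, ...)` and `min_transform_bm(0.8, ...)` should agree, and the product `psi_plus * psi_minus` reproduces $\delta/(\delta - a - \kappa(b))$ by construction.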


4e A quintuple identity 
Consider the quintuple $(V_1, V_2, V_3, V_4, V_5)$ given by the r.v.'s

\[
\overline{G}_{\tau^+(x)-},\quad \overline{X}_{\tau^+(x)-},\quad \tau^+(x) - \overline{G}_{\tau^+(x)-},\quad X_{\tau^+(x)-},\quad X_{\tau^+(x)}
\]

(the time of the last maximum before first passage, the value of that maximum,
the time from that maximum to first passage, the value just before first passage,
and the value just after, see Figure XI.1).

FIGURE XI.1


Further define the measures $U$, $\hat{U}$ (often called potential measures) by

\[
U(ds,dx) = \int_0^\infty P(L_t^{-1} \in ds,\ M_t \in dx)\,dt,
\]

\[
\hat{U}(ds,dx) = \int_0^\infty P(\hat{L}_t^{-1} \in ds,\ \hat{M}_t \in dx)\,dt,
\]

where as before $\hat{L}_t^{-1}$ and $\hat{M}_t$ refer to the corresponding quantities of the descending
ladder height process. By Fubini's Theorem and (4.1), the bivariate
Laplace transform of $U$ has the simple form

\[
\int_{[0,\infty)^2} e^{-as-bx}\,U(ds,dx) = \int_0^\infty dt\cdot E\bigl(e^{-aL_t^{-1} - bM_t}\bigr) = \frac{1}{\phi(a,b)}. \tag{4.6}
\]

Remark 4.5 From (4.6), obviously $\int_{[0,\infty)^2} U(ds,dx) = 1/\phi(0,0)$. With the
definition $U(dx) = \int_{[0,\infty)} U(ds,dx)$ one then sees by normalization that

\[
\psi(u) = P(\tau(u) < \infty) = \phi(0,0)\,U(u,\infty). \tag{4.7}
\]

This representation of the ruin probability can be interpreted as the continuous-time
extension of the Pollaczeck-Khinchine formula of Theorem IV.2.1, because
$M_t$ is the (ascending) ladder height process.


Theorem 4.6 The conditional distribution of $V_3, V_4$ given $V_1, V_2$ depends only
on $V_2$, and the conditional distribution of $V_5$ given $V_1, V_2, V_3, V_4$ depends only on
$V_4$. Further, there exists a normalization of the local time such that the density
of $(V_1,\ x - V_2,\ V_3,\ x - V_4,\ V_5 - x)$ at $(s,y,t,v,z)$ can be written as

\[
U(ds,\ x - dy)\,\hat{U}(dt,\ dv - y)\,\nu(dz + v).
\]

Proof. The claims on the conditional distributions are clear from Figure XI.1
and the strong Markov property. This gives a factorization of the density of
$(V_1,\ x - V_2,\ V_3,\ x - V_4,\ V_5 - x)$ as $h_1(ds,\ x - dy)\,h_2(dt,\ dv - y)\,h_3(dz + v)$. Here
it is clear that $h_3(dz + v) = \nu(dz + v)$. The claim on $h_1, h_2$ will not be shown
here (see e.g. again [564]).


The quintuple law usually does not lead to explicit formulas. For spectrally
positive Lévy processes $\{X_t\}$, one can however obtain the following simpler
expression for the joint density of the last maximum before first passage $V_2$, the
value just before passage $V_4$ and the value after passage $V_5$:

Corollary 4.7 If $\{X_t\}$ is a spectrally positive Lévy process drifting to $-\infty$ with
scale function $W(u) = W^{(0)}(u)$, then the density of $(x - V_2,\ x - V_4,\ V_5 - x)$ at
$(y,v,z)$ is given by

\[
W'(x - y)\,\nu(dz + v)\,dv\,dy, \qquad 0 < y < \min(x,v),\ z > 0. \tag{4.8}
\]


Proof. We can use the expressions obtained in Example 4.4, but have to reverse
the role of the ascending and descending ladder process, because now we have
a spectrally positive process. From (4.6) with $a = 0$ we see that

\[
\int_0^\infty e^{-bx}\,U(dx) = \frac{1}{\phi(0,b)} = \frac{b}{\kappa(-b)}. \tag{4.9}
\]

At the same time $\hat\phi(0,b) = b$, so we can choose $\hat{U}(dx) = dx$. But in view of
Theorem 4.6 this implies that the density in (4.8) is

\[
U(x - dy)\,\nu(dz + v)\,dv.
\]

However, comparing with the definition (3.2) of the scale function $W^{(0)}$, it is
clear that $U$ can be identified with $W$, because the latter is only unique up
to a constant (which can be controlled by the normalization of the local time of
$X$ at its supremum).


Remark 4.8 If in addition $\{X_t\}$ has bounded variation, then from (1.6)

\[
\kappa(-b) = c_1 b + \int_0^\infty (e^{-by} - 1)\,\nu(dy)
\]

with $E(X_1) < 0$ and one can infer from (4.9) by expanding the resulting geometric
series that

\[
U(dx) = \frac{1}{c_1}\sum_{n=0}^\infty \chi^{*n}(dx),
\]

where $\chi(dx) = \nu(x,\infty)\,dx/c_1$. Correspondingly, (4.8) can in this case also be
written as

\[
\frac{1}{c_1}\sum_{n=0}^\infty \chi^{*n}(x - dy)\,\nu(dz + v)\,dv, \qquad 0 < y < x,\ y < v,\ z > 0.
\]


In terms of the risk reserve process $R_t$, the above result gives a formula for
the joint density of the surplus prior to ruin, the deficit at ruin and the size
of the last minimum before ruin in terms of the scale function and the Lévy
measure; see Chapter XII for a further discussion.


Notes and references A complete proof of Theorem 4.6 is given in Doney &
Kyprianou [326], where also asymptotics of the quintuple law for $x \to \infty$ are given.
Kuznetsov [563] recently gave quite general criteria under which the Wiener-Hopf 
factors are of semi-explicit form and identified a set of tractable special cases. 


5 The scale function for two-sided phase-type 
jumps 


In this section, we assume that $\{X_t\}_{t\ge 0}$ is the superposition of a Brownian
motion with drift $\mu$ and variance constant $\sigma^2 > 0$ and two compound Poisson
processes, one having upward jumps at rate $\lambda^+$ and being phase-type with representation
$(E^+, \alpha^+, T^+)$ and the other having downward jumps at rate $\lambda^-$ and
being phase-type with representation $(E^-, \alpha^-, T^-)$ (the cardinalities of $E^+, E^-$
are denoted by $p^+$, resp. $p^-$). That is, the Lévy exponent $\kappa(s)$ as defined by
$\log E e^{sX_t}/t$ equals

\[
s\mu + s^2\sigma^2/2 + \lambda^+\bigl(\alpha^+(-sI - T^+)^{-1}t^+ - 1\bigr) + \lambda^-\bigl(\alpha^-(sI - T^-)^{-1}t^- - 1\bigr). \tag{5.1}
\]

It is well-defined in the strip $\Theta = \{s \in \mathbb{C}:\ \rho^- < \Re s < \rho^+\}$ where $\rho^+$ is the
eigenvalue with largest real part of $-T^+$ and $\rho^-$ is the eigenvalue with smallest
real part of $T^-$.


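Theorem 5.1 Assume that there exist $p = p^+ + p^- + 2$ distinct complex numbers
$s_k$ such that $\kappa(s_k) = \delta$. Define $c_0^+(s) = c_0^-(s) = 1$ and $c_i^+(s) = e_i^{\mathsf{T}}(-sI -
T^+)^{-1}t^+$, $c_i^-(s) = e_i^{\mathsf{T}}(sI - T^-)^{-1}t^-$, and denote by $b_0^+, \ldots, b_{p^+}^+$, $b_0^-, \ldots,
b_{p^-}^-$ the solutions to the $p$ linear equations

\[
e^{s_k u} = \sum_{i=0}^{p^+} c_i^+(s_k)\,e^{s_k a}\,b_i^+ + \sum_{i=0}^{p^-} c_i^-(s_k)\,b_i^-, \qquad k = 1,\ldots,p. \tag{5.2}
\]

Then

\[
E_u\bigl[e^{-\delta\tau_a^+};\ \tau_a^+ < \tau_0^-\bigr] = \sum_{i=0}^{p^+} b_i^+, \tag{5.3}
\]

\[
E_u\bigl[e^{-\delta\tau_0^-};\ \tau_0^- < \tau_a^+\bigr] = \sum_{i=0}^{p^-} b_i^-. \tag{5.4}
\]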


Remark 5.2 The r.h.s. of (5.1) is well-defined not just for $\Re(s) \in (\rho^-, \rho^+)$, but
for any $s \in \mathbb{C}$ that is not an eigenvalue of $-T^+$ or $T^-$, and is analytic in this
domain. The roots $s_k$ should be looked for in this entire domain, not just $\Theta$.
Note that we can write

\[
\alpha^+(-sI - T^+)^{-1}t^+ = n^+(s)/d^+(s), \qquad \alpha^-(sI - T^-)^{-1}t^- = n^-(s)/d^-(s),
\]

where (assuming minimal PH representations) $n^+(s)$, $d^+(s)$, $n^-(s)$, $d^-(s)$ are
polynomials of degree $p^+ - 1$, $p^+$, $p^- - 1$ and $p^-$, respectively. Thus, the defining
equation for the $s_k$ can be written

\begin{align*}
\delta\,d^+(s)d^-(s) ={}& s\,d^+(s)d^-(s)\,\mu + s^2 d^+(s)d^-(s)\,\sigma^2/2 \\
 &+ \lambda^+\bigl(n^+(s)d^-(s) - d^+(s)d^-(s)\bigr) + \lambda^-\bigl(n^-(s)d^+(s) - d^+(s)d^-(s)\bigr).
\end{align*}

This is a polynomial equation of degree $p = p^+ + p^- + 2$, so that indeed $p$ roots
exist. For the question of the roots being distinct, see the Notes and References.

If $\sigma^2 = 0$, $\mu \ne 0$, one only has to look for $p^+ + p^- + 1$ roots, and if $\sigma^2 =
0$, $\mu = 0$, only for $p^+ + p^-$ roots. The modifications are obvious and will not be
spelled out.
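To illustrate the root-finding step and the resulting linear system, here is a minimal sketch for the simplest case $p^+ = p^- = 1$, i.e. exponential jumps in both directions with hypothetical rates `eta_up`, `eta_dn`. It assumes that the linear equations take the form $e^{s_k u} = e^{s_k a}\bigl(b_0^+ + c_1^+(s_k) b_1^+\bigr) + b_0^- + c_1^-(s_k) b_1^-$, which is what optional stopping of $e^{s_k X_t - \delta t}$ suggests, and that $\sigma > 0$, $\delta > 0$, so that the four roots are real with one in each interval between the poles. Bisection and the small Gaussian elimination are illustrative choices only.

```python
import math

def kappa(s, mu, sigma, lam_up, eta_up, lam_dn, eta_dn):
    """Levy exponent (5.1) when both jump laws are exponential (p+ = p- = 1)."""
    return (mu * s + 0.5 * sigma ** 2 * s ** 2
            + lam_up * (eta_up / (eta_up - s) - 1.0)
            + lam_dn * (eta_dn / (eta_dn + s) - 1.0))

def bisect(f, lo, hi, iters=200):
    """Bisection on [lo, hi]; f(lo) and f(hi) must have opposite signs."""
    f_lo_positive = f(lo) > 0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (f(mid) > 0) == f_lo_positive:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def roots(delta, mu, sigma, lam_up, eta_up, lam_dn, eta_dn):
    """The four real roots of kappa(s) = delta, one per interval between the poles."""
    f = lambda s: kappa(s, mu, sigma, lam_up, eta_up, lam_dn, eta_dn) - delta
    eps, big = 1e-9, 1.0
    while f(eta_up + big) <= 0 or f(-eta_dn - big) <= 0:
        big *= 2.0  # terminates because sigma > 0 makes f grow quadratically
    return [bisect(f, -eta_dn - big, -eta_dn - eps),
            bisect(f, -eta_dn + eps, 0.0),
            bisect(f, 0.0, eta_up - eps),
            bisect(f, eta_up + eps, eta_up + big)]

def solve(A, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def exit_quantities(u, a, delta, mu, sigma, lam_up, eta_up, lam_dn, eta_dn):
    """Return (sum of b_i^+, sum of b_i^-): discounted up-exit and down-exit terms."""
    s = roots(delta, mu, sigma, lam_up, eta_up, lam_dn, eta_dn)
    A = [[math.exp(sk * a), math.exp(sk * a) * eta_up / (eta_up - sk),
          1.0, eta_dn / (eta_dn + sk)] for sk in s]
    rhs = [math.exp(sk * u) for sk in s]
    b0p, b1p, b0m, b1m = solve(A, rhs)
    return b0p + b1p, b0m + b1m
```

In a symmetric setting (zero drift, identical jump laws in both directions) started at the midpoint $u = a/2$, the two returned values should coincide, and starting at $u = a$ the first one should equal 1.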


Proof of Theorem 5.1. Let 0 < s < pt and Z = X: — 6t/s. The Kella-Whitt 
martingale then becomes M; = M;(s) where 


t $ 
Mils) = n(s) f e842» dv + e” — 0% = sf eZ don 
o 0 


Let T = T3 V7). Then 
IMi] < Kls) —dle%r +e + e+ 


for 0 < t < 7 where we can think of Vt as the possible overshoot over a. Since 
V+ is phase-type with phase generator T+, we have Ees@+Y") < oo. Also 
iT < oo, and it follows that sup,<, |M:| is integrable so that optional stopping 
at T is permissible. E 
Let $B_0^+$ be the event that $a$ is upcrossed before 0 is downcrossed and that
the upcrossing results from the Brownian motion and not a jump, and let $B_i^+$
be the event that $a$ is upcrossed before 0 is downcrossed and that this results
from a jump being in phase $i$ at the upcrossing. Similarly, let $B_0^-$ be the event
that 0 is downcrossed before $a$ is upcrossed and that the downcrossing results
from the Brownian motion and not a jump, and let $B_i^-$ be the event that 0 is
downcrossed before $a$ is upcrossed and that this results from a jump being in
phase $i$ at the downcrossing. Write

\[
b_i^+ = E_u\bigl[e^{-\delta\tau_a^+};\ \tau_a^+ < \tau_0^-,\ B_i^+\bigr], \qquad
b_i^- = E_u\bigl[e^{-\delta\tau_0^-};\ \tau_0^- < \tau_a^+,\ B_i^-\bigr].
\]


Optional stopping at 7 now gives 


Mọ = e“ = E,M, = (x(s) 8) f eos dy — Ee? +8*, (5.5) 
0 
Given $B_i^+$, the overshoot over $a$ equals 0 for $i = 0$ and is phase-type with
representation $(E^+, e_i^{\mathsf{T}}, T^+)$ for $i > 0$. A similar argument applies to $B_i^-$, and
so expanding the r.h.s. of (5.5), we obtain


et = (n(s)—8) [et av — Ehet -Serer 66) 
0 i=0 i=0 


Now note that 


0< u efe du < E,re* (5.7) 
0 


for any s € C. This readily implies that E, [{-} is an analytic function defined 
for all s € C. It therefore follows by analytic continuation that 


T qt q 
e“ = (k(s)— | e°% dy — S > cf (seat — X a (s)bF 
i=0 i=0 


for all s ¢ 2. In particular, taking s = sz this becomes (5.2). Formulas (5.3) 
and (5.4) are then clear. 


Notes and references Theorem 3.3 occurs in Asmussen, Avram & Pistorius [70], 
though with a somewhat different proof. Further references on fluctuation theory for 
Lévy processes under phase-type assumptions include Pistorius [704], Dieker [323] and 
a series of papers by Mordecki and co-authors, e.g. Lewis & Mordecki [582]. A phase- 
type approximation for CGMY Lévy processes with applications for the pricing of 
equity default swaps is given in Asmussen, Madan & Pistorius [90]. 

In practice, the roots $s_k$ will more or less always be found to be distinct. In the
rare cases with one or more roots having multiplicity > 1, modifications are needed. 
For an example of how this can be done, see D’Auria et al. [276]. 
Chapter XII


Gerber-Shiu functions 


1 Introduction 


At some places in previous chapters we have seen results on the time of ruin
$\tau(u)$, the deficit at ruin $\xi(u) = |R_{\tau(u)}|$ and the surplus prior to ruin $R_{\tau(u)-}$.
In this chapter we will study a combination of these quantities simultaneously,
which leads to a tractable and elegant treatment. This combination is of the
form of an expected discounted penalty at ruin

\[
m(u) = E\bigl[e^{-\delta\tau(u)}\,w(R_{\tau(u)-}, \xi(u));\ \tau(u) < \infty\bigr], \tag{1.1}
\]

where the penalty $w$ is a non-negative function of the surplus prior to ruin and
the deficit at ruin. The expression $m(u)$ is usually referred to as the Gerber-Shiu
function. Clearly, for $w \equiv 1$ and $\delta = 0$, (1.1) reduces to the ruin probability
$\psi(u)$, and for $w \equiv 1$ and $\delta > 0$ one arrives (with a slight abuse of terminology)
at the Laplace transform of the time to ruin $\tau(u)$. Alternatively, if $\delta = 0$ and
$w$ is the bivariate Dirac-delta function, (1.1) represents the joint density of the
surplus prior to ruin and the deficit at ruin. The parameter $\delta \ge 0$ can be
interpreted both as a discount rate and the Laplace transform argument. The
refinement to include time-dependence in the analysis is a natural step towards
a better understanding of the behavior of the risk process.

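If $f(x,y,t|u)$ denotes the (defective) joint density of surplus prior to ruin,
deficit at ruin and time of ruin given that $R_0 = u$, then the finite-time ruin
probability can be expressed as

\[
\psi(u,t) = \int_0^t\Bigl(\int_0^\infty\!\!\int_0^\infty f(x,y,s|u)\,dx\,dy\Bigr)ds,
\]

from which an integration by parts and the representation

\[
m(u) = \int_0^\infty\!\!\int_0^\infty\!\!\int_0^\infty e^{-\delta t}\,w(x,y)\,f(x,y,t|u)\,dt\,dx\,dy
\]

yields that for $w \equiv 1$

\[
E\bigl[e^{-\delta\tau(u)};\ \tau(u) < \infty\bigr] = \int_0^\infty e^{-\delta t}\,\frac{\partial}{\partial t}\psi(u,t)\,dt = \delta\int_0^\infty e^{-\delta t}\,\psi(u,t)\,dt. \tag{1.2}
\]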


Thus the Gerber-Shiu function also contains as a special case the Laplace trans- 
form (w.r.t. time) of the finite-time ruin probability (or, equivalently, the ruin 
probability up to a random exponential time horizon with parameter 6). Other 
choices of the penalty w lead to interpretations of m(u) as the expected present 
value of deferred continuous annuities during the first negative excursion of Ry 
or the price of a perpetual American put option on an asset with dynamics given 
by {R;} as well as the price of reset guarantees for mutual funds (see e.g. Gerber 
& Shiu [410]). 
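The integration by parts behind (1.2) can be spelled out as follows. This is a sketch under the assumptions $\delta > 0$, $\psi(u,0) = 0$ for $u \ge 0$ and $0 \le \psi(u,t) \le 1$, so that both boundary terms vanish; $e_\delta$ denotes an exponential($\delta$) time independent of the risk process.

```latex
\begin{align*}
E\bigl[e^{-\delta\tau(u)};\,\tau(u)<\infty\bigr]
 &= \int_0^\infty e^{-\delta t}\,\frac{\partial}{\partial t}\psi(u,t)\,dt \\
 &= \Bigl[e^{-\delta t}\,\psi(u,t)\Bigr]_{t=0}^{t=\infty}
    + \delta\int_0^\infty e^{-\delta t}\,\psi(u,t)\,dt \\
 &= \delta\int_0^\infty e^{-\delta t}\,\psi(u,t)\,dt
  \;=\; P\bigl(\tau(u)\le e_\delta\bigr).
\end{align*}
```

The last equality is the interpretation as a ruin probability up to an exponential time horizon: $\delta e^{-\delta t}$ is the density of $e_\delta$ and $\psi(u,t) = P(\tau(u) \le t)$.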

Define the discounted (defective) joint density of surplus prior to ruin and
deficit at ruin as

\[
f(x,y|u) = \int_0^\infty e^{-\delta t}\,f(x,y,t|u)\,dt
\]

and the discounted density of the surplus prior to ruin as

\[
f(x|u) = \int_0^\infty f(x,y|u)\,dy.
\]

Let us start with some general considerations for renewal risk models. Unless
stated otherwise, we will always assume a positive safety loading $\eta > 0$. Although
not always necessary, we usually assume that the claim size distribution
$B$ has a density $b$. With the notation $\omega(x) = \int_x^\infty w(x, y - x)\,B(dy)$, an alternative
representation of the Gerber-Shiu function is

\[
m(u) = \int_0^\infty\!\!\int_0^\infty w(x,y)\,f(x|u)\,\frac{b(x+y)}{\overline{B}(x)}\,dy\,dx = \int_0^\infty \omega(x)\,\frac{f(x|u)}{\overline{B}(x)}\,dx.
\]


It is now easy to derive a defective renewal equation for $m(u)$ that holds for
general (zero-delayed) renewal models. Conditioning on the first time that the
surplus falls below the initial level $u$ and the size of this jump, one has

\begin{align*}
m(u) &= \int_0^u m(u-y)\int_0^\infty\!\!\int_0^\infty e^{-\delta t}\,f(x,y,t|0)\,dt\,dx\,dy \\
 &\quad + \int_u^\infty\!\!\int_0^\infty\!\!\int_0^\infty e^{-\delta t}\,w(x+u,\,y-u)\,f(x,y,t|0)\,dt\,dx\,dy \\
 &= \int_0^u m(u-y)\int_0^\infty f(x,y|0)\,dx\,dy + \int_u^\infty\!\!\int_0^\infty w(x+u,\,y-u)\,f(x,y|0)\,dx\,dy. \tag{1.3}
\end{align*}


Denoting with

\[
g_\delta(y) = \int_0^\infty f(x,y|0)\,dx \tag{1.4}
\]

the defective discounted density of deficit at ruin when $u = 0$, the defective
renewal equation can be written as $m = m * g_\delta + h$ for the function $h$ specified
in (1.3). This equation will be useful at a number of places later on. A first
consequence is

\[
m(0) = \int_0^\infty\!\!\int_0^\infty w(x,y)\,f(x,y|0)\,dx\,dy. \tag{1.5}
\]


Throughout the chapter, we tacitly assume

\[
\int_0^\infty\!\!\int_0^\infty w(x,y)\,b(x+y)\,dx\,dy < \infty, \tag{1.6}
\]

which is a natural condition to ensure that $m(u)$ is finite for all $u \ge 0$. In view
of $\eta > 0$, we will also assume the natural boundary condition

\[
\lim_{u\to\infty} m(u) = 0 \tag{1.7}
\]

(which in many cases is automatically fulfilled under additional assumptions on
the interplay between the penalty $w$ and the claim size distribution $B$).
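A defective renewal equation of the form $m = m * g + h$ is straightforward to solve numerically. The sketch below uses a simple trapezoidal discretization; the scheme and all names are illustrative. As a test case it assumes the classical compound Poisson model with premium rate 1, $\delta = 0$, $w \equiv 1$ and exponential($\nu$) claims, for which $g(y) = \beta e^{-\nu y}$, $h(u) = (\beta/\nu)e^{-\nu u}$ and the solution is the well-known $\psi(u) = (\beta/\nu)e^{-(\nu-\beta)u}$.

```python
import math

def solve_renewal(g, h, u_max, n):
    """Trapezoidal scheme for m(u) = int_0^u m(u-y) g(y) dy + h(u) on n grid steps.

    Returns the list [m(0), m(step), ..., m(u_max)] with step = u_max / n.
    """
    step = u_max / n
    gv = [g(k * step) for k in range(n + 1)]
    m = [h(0.0)]  # at u = 0 the convolution term vanishes
    for k in range(1, n + 1):
        # Known part of the trapezoidal sum; the j = 0 term contains m[k] itself.
        known = 0.5 * gv[k] * m[0] + sum(gv[j] * m[k - j] for j in range(1, k))
        m.append((h(k * step) + step * known) / (1.0 - 0.5 * step * gv[0]))
    return m

def exponential_claims_case(beta, nu):
    """Return (g, h, exact solution) for delta = 0, w = 1 and exponential(nu) claims."""
    g = lambda y: beta * math.exp(-nu * y)
    h = lambda u: (beta / nu) * math.exp(-nu * u)
    exact = lambda u: (beta / nu) * math.exp(-(nu - beta) * u)
    return g, h, exact
```

With `beta = 1`, `nu = 2`, `u_max = 5` and `n = 1000`, the grid values returned by `solve_renewal` should match `exact` to within the discretization error of the scheme.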


Notes and references The investigation of extensions of ruin probabilities has a 
long history, see e.g. Segerdahl [791], Siegmund [806], Gerber, Goovaerts & Kaas [405] 
and Dickson [305]. The definition of m(u) and the derivation of many of its properties 
goes back to Gerber & Shiu [408, 409]. Since then, this topic has experienced an 
enormous interest and activity. In a diffusion set-up, another function involving the 
time value of ruin was discussed in Powers [710] under the name expected discounted 
cost of insolvency. 

In the literature, often additional conditions on the penalty function $w$ like boundedness
and continuity are imposed, which for instance ensure absolute continuity of $m$.
However, with some effort (and sometimes at the expense of regularity properties of $m$)
these assumptions can usually be relaxed to condition (1.6). To avoid a too technical 
exposition, we will therefore not always be precise on the conditions for w, with the 
implicit understanding that for the respective proof method w is chosen appropriately 
and subsequently this choice can be relaxed (for a detailed discussion see e.g. Schmidli 
[780]). 


2 The compound Poisson model 
If $\{R_t\}$ is the classical Cramér-Lundberg process, one can derive an integro-differential
equation (IDE) for $m$, for instance via the following direct argument:

Let $h > 0$. By conditioning on the time and amount of the first jump before
time $h$ (if there is such a jump), we get

\[
m(u) = \int_0^h \beta e^{-(\beta+\delta)t}\Bigl(\int_0^{u+t} m(u+t-x)\,B(dx)
 + \int_{u+t}^\infty w(u+t,\ x-u-t)\,B(dx)\Bigr)dt + e^{-(\beta+\delta)h}\,m(u+h).
\]

We differentiate this equation with respect to $h$ and set $h = 0$ in the resulting
equation.$^1$ This yields

\[
\beta\int_0^u m(u-x)\,B(dx) + \beta\int_u^\infty w(u,\ x-u)\,B(dx) - (\beta+\delta)\,m(u) + m'(u) = 0. \tag{2.1}
\]

Under the boundary condition $\lim_{u\to\infty} m(u) = 0$, equation (2.1) has a unique
solution.

The equation discussed in the following lemma will turn out to be of crucial 
importance throughout the whole section. 


Lemma 2.1 If $\hat{B}[r]$ exists for an $r > 0$ and is steep (cf. p.91) and $\delta > 0$, then,
within the domain of $\hat{B}[r]$, the Lundberg fundamental equation

\[
\kappa(r) = \beta(\hat{B}[r] - 1) - r = \delta \tag{2.2}
\]

has one positive root $\gamma_\delta > 0$ and one negative root $-\rho_\delta < 0$.
Proof. The only difference to equation IV.(5.2) is that now $\hat{B}[r] = 1 + \delta/\beta + r/\beta$,
so the result is obvious by the convexity of $\hat{B}[r]$ (see Figure XII.1). Note that
$\gamma_\delta > \gamma_0 = \gamma$.
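For exponential claims the two roots of (2.2) are available in closed form, which gives a quick way to see the lemma at work. The sketch assumes exponential($\nu$) claim sizes, so that $\hat{B}[r] = \nu/(\nu - r)$ for $r < \nu$, premium rate 1 and $\beta < \nu$; multiplying (2.2) by $\nu - r$ turns it into the quadratic $r^2 + (\beta + \delta - \nu)r - \delta\nu = 0$. Function names are illustrative.

```python
import math

def kappa(r, beta, nu):
    """kappa(r) = beta*(B_hat[r] - 1) - r for exponential(nu) claims, r < nu."""
    return beta * (nu / (nu - r) - 1.0) - r

def lundberg_roots(beta, nu, delta):
    """Return (gamma_delta, rho_delta): the roots of kappa(r) = delta are gamma_delta and -rho_delta."""
    p = beta + delta - nu
    disc = math.sqrt(p * p + 4.0 * delta * nu)
    gamma_delta = 0.5 * (-p + disc)   # positive root, lies in (0, nu)
    rho_delta = 0.5 * (p + disc)      # the negative root is -rho_delta
    return gamma_delta, rho_delta
```

For `beta = 1`, `nu = 2`, `delta = 0.1` this gives roughly `gamma_delta = 1.0844` and `rho_delta = 0.1844`; for `delta = 0` it returns the adjustment coefficient `nu - beta` and `rho_delta = 0`.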


1Note that the differentiability is again guaranteed by the same argument as in Remark 
VIII.1.11. Other ways to derive equation (2.1) include the (essentially equivalent) generator 
approach (cf. Chapter II) and the method given in Section 3c. 




FIGURE XII.1 


2a A Laplace transform approach 


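In view of the convolution term in the integro-differential equation (2.1), the
analysis becomes particularly transparent with Laplace transforms. Let
$\hat{m}[-s] = \int_0^\infty e^{-su}\,m(u)\,du$
and $\hat\omega[-s] = \int_0^\infty e^{-su}\int_u^\infty w(u,\ x-u)\,B(dx)\,du$. Then taking Laplace transforms
in (2.1) leads to

\[
\hat{m}[-s] = \frac{m(0) - \beta\hat\omega[-s]}{\kappa(-s) - \delta}.
\]

By (1.7), $m(u)$ is bounded in $u$, so its Laplace transform $\hat{m}[-s]$ must be an
analytic function for (at least) $\Re(s) > 0$ and hence the positive zero $s = \rho_\delta$ of
the denominator must also be a zero of the numerator. In this way, one obtains
by purely analytic arguments the identity

\[
m(0) = \beta\hat\omega[-\rho_\delta] = \beta\int_0^\infty\!\!\int_0^\infty e^{-\rho_\delta x}\,w(x,y)\,b(x+y)\,dy\,dx. \tag{2.3}
\]

From this we arrive at

\[
\hat{m}[-s] = \frac{\beta\bigl(\hat\omega[-\rho_\delta] - \hat\omega[-s]\bigr)}{s - \delta - \beta + \beta\hat{B}[-s]}. \tag{2.4}
\]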




Remark 2.2 Equation (2.3) contains surprisingly explicit information: if one
chooses for $w(x,y)$ the Dirac-delta function for the second argument, one obtains
the discounted probability density function of the deficit at ruin for initial
surplus zero

\[
g_\delta(y) = \beta\int_0^\infty e^{-\rho_\delta x}\,b(x+y)\,dx, \qquad y > 0, \tag{2.5}
\]

which provides an alternative proof of Lemma V.(3.2).

On the other hand, the choice $w \equiv 1$ (and using $\hat\omega[-\rho_\delta] = (1 - \hat{B}[-\rho_\delta])/\rho_\delta$
and $\kappa(-\rho_\delta) = \delta$) leads to

\[
E\bigl[e^{-\delta\tau(0)};\ \tau(0) < \infty\bigr] = 1 - \frac{\delta}{\rho_\delta}, \tag{2.6}
\]

which already appeared in Corollary V.3.4 with a related, but somewhat different
proof.

In view of (1.2), (2.6) implies that the Laplace transform of the finite-time
survival probability $\phi(0,t) = 1 - \psi(0,t)$ w.r.t. $t$ is simply given by

\[
\int_0^\infty e^{-\delta t}\,\phi(0,t)\,dt = 1/\rho_\delta.
\]


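Remark 2.3 Note that for $\delta \to 0$, under the net profit condition $\eta > 0$, we have
$\rho_\delta \to 0$. An application of L'Hôpital's rule in (2.2) then gives $\delta/\rho_\delta \to 1 - \beta\mu_B$,
and so the formulas in Remark 2.2 correspondingly simplify to $g_0(y) = \beta\overline{B}(y)$
(which is the ladder height density of the compound Poisson process as derived
in IV.2) and (2.6) reduces to $\psi(0) = \beta\mu_B$.

Similarly, for $\delta = 0$ equation (2.4) simplifies to

\[
\hat{m}[-s] = \frac{\beta\bigl(\hat\omega[0] - \hat\omega[-s]\bigr)}{s - \beta(1 - \hat{B}[-s])}. \tag{2.7}
\]

If further $w \equiv 1$, this gives

\[
\hat\psi[-s] = \frac{\beta\bigl(\mu_B - (1 - \hat{B}[-s])/s\bigr)}{s - \beta(1 - \hat{B}[-s])}
 = \frac{1}{s} - \frac{1 - \beta\mu_B}{s - \beta(1 - \hat{B}[-s])}, \tag{2.8}
\]

which is another way to write the Pollaczeck-Khinchine formula IV.(2.2).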


This link to the classical model can be used to obtain some further nice 
identities in a simple way: 
Proposition 2.4 The defective (non-discounted) density of the surplus prior to
ruin in the compound Poisson risk model with initial capital $u$ is given by

\[
f_0(x|u) = \frac{\beta}{1 - \beta\mu_B}\Bigl(\overline{B}(x)\bigl(1 - \psi(u)\bigr) - I(x \le u)\,\overline{B}(x)\bigl(1 - \psi(u-x)\bigr)\Bigr) \tag{2.9}
\]

and the defective (non-discounted) density of the deficit at ruin is

\[
f_0(y|u) = \frac{\beta}{1 - \beta\mu_B}\Bigl(\overline{B}(y)\bigl(1 - \psi(u)\bigr) - \int_0^u \bigl(1 - \psi(u-x)\bigr)\,b(x+y)\,dx\Bigr). \tag{2.10}
\]


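Proof. Replacing the denominator in (2.7) by (2.8) gives, after inverting the
Laplace transform,

\[
m(u) = \frac{\beta}{1 - \beta\mu_B}\Bigl(\hat\omega[0]\bigl(1 - \psi(u)\bigr) - \int_0^u \bigl(1 - \psi(u-x)\bigr)\int_x^\infty w(x,\ y-x)\,B(dy)\,dx\Bigr).
\]

If we now choose $w(x,y) = e^{-\alpha x}$, i.e. $\hat\omega[-s] = (1 - \hat{B}[-\alpha-s])/(\alpha+s)$, then
this leads to

\[
E\bigl[e^{-\alpha R_{\tau(u)-}};\ \tau(u) < \infty\bigr]
 = \frac{\beta}{1 - \beta\mu_B}\Bigl(\frac{1 - \hat{B}[-\alpha]}{\alpha}\bigl(1 - \psi(u)\bigr) - \int_0^u \bigl(1 - \psi(u-x)\bigr)\,e^{-\alpha x}\,\overline{B}(x)\,dx\Bigr),
\]

and the inverse Laplace transform w.r.t. $\alpha$ is (2.9). On the other hand, the
choice $w(x,y) = e^{-\alpha y}$, i.e. $\hat\omega[-s] = (\hat{B}[-\alpha] - \hat{B}[-s])/(s - \alpha)$, gives

\[
E\bigl[e^{-\alpha\xi(u)};\ \tau(u) < \infty\bigr]
 = \frac{\beta}{1 - \beta\mu_B}\Bigl(\frac{1 - \hat{B}[-\alpha]}{\alpha}\bigl(1 - \psi(u)\bigr) - \int_0^u \bigl(1 - \psi(u-x)\bigr)\int_x^\infty e^{-\alpha(y-x)}\,B(dy)\,dx\Bigr),
\]

and its inverse Laplace transform is (2.10).$^2$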


Taking derivatives at a = 0 of the above Laplace transforms now leads to 
the following identities: 


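Corollary 2.5 The moments $E[(R_{\tau(u)-})^n;\ \tau(u) < \infty]$ of the surplus prior to
ruin are given by

\[
\frac{\beta}{1 - \beta\mu_B}\Bigl(\frac{\mu_B^{(n+1)}}{n+1}\bigl(1 - \psi(u)\bigr) - \int_0^u x^n\bigl(1 - \psi(u-x)\bigr)\,\overline{B}(x)\,dx\Bigr),
\]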

2Note that from these expressions one can again observe the somewhat curious fact that 
in the compound Poisson model for u = 0 the distributions of the surplus prior to ruin and of 
the deficit at ruin coincide, see also Theorem IV.2.2. 
and the moments $E[\xi(u)^n;\ \tau(u) < \infty]$ of the deficit at ruin are given by

\[
\frac{\beta}{1 - \beta\mu_B}\Bigl(\frac{\mu_B^{(n+1)}}{n+1}\bigl(1 - \psi(u)\bigr) - \int_0^u \bigl(1 - \psi(u-x)\bigr)\int_x^\infty (y-x)^n\,B(dy)\,dx\Bigr).
\]

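Proposition 2.6 Let $\mu_B^{(n+1)} < \infty$ and define $\psi_n(u) = E[\tau(u)^n;\,\tau(u)<\infty]$ for
$n \in \mathbb{N}_0$. Then for $n \ge 1$, $\psi_n(u)$ is given by
\[ \psi_n(u) = \frac{n}{1-\beta\mu_B}\Big(\int_0^u \psi(u-y)\,\psi_{n-1}(y)\,dy + \int_u^\infty \psi_{n-1}(y)\,dy - \psi(u)\int_0^\infty \psi_{n-1}(y)\,dy\Big). \]
In particular,
\[ E[\tau(u)\,|\,\tau(u)<\infty] = \frac{\int_0^u \psi(u-y)\,\psi(y)\,dy + \int_u^\infty \psi(y)\,dy - \frac{\beta\mu_B^{(2)}}{2(1-\beta\mu_B)}\,\psi(u)}{(1-\beta\mu_B)\,\psi(u)}. \]

Proof. For $w \equiv 1$, $v_n(s) := \frac{\partial^n}{\partial\delta^n}\hat m_\delta[-s]\big|_{\delta=0}$ is the Laplace transform of $(-1)^n\psi_n(u)$
w.r.t. $u$ (the subscript of $m_\delta$ makes the dependence on $\delta$ explicit). Consequently, $n$-fold differentiation w.r.t. $\delta$ of
\[ (\kappa(-s)-\delta)\,\hat m_\delta[-s] = m_\delta(0) - \beta\hat\omega[-s] \]
and choosing $\delta = 0$ gives
\[ \kappa(-s)\,v_n(s) = \frac{\partial^n m_\delta(0)}{\partial\delta^n}\Big|_{\delta=0} + n\,v_{n-1}(s). \]
Since $v_n(s)$ is an analytic function for $\Re(s) > 0$ and $s = 0$ is the only zero of
$\kappa(-s)$ in the right halfplane, it follows that
\[ \frac{\partial^n m_\delta(0)}{\partial\delta^n}\Big|_{\delta=0} = -n\lim_{s\to 0} v_{n-1}(s) = (-1)^n\, n\int_0^\infty \psi_{n-1}(y)\,dy. \]
Hence, together with (2.8), we obtain
\[ v_n(s) = \frac{n}{1-\beta\mu_B}\Big((-1)^n\int_0^\infty \psi_{n-1}(y)\,dy + v_{n-1}(s)\Big)\Big(\frac{1}{s}-\hat\psi[-s]\Big), \]
from which the desired formula for $\psi_n(u)$ follows immediately. Finally, the
formula for $n = 1$ follows from $\psi_0(x) = \psi(x)$ and the identity $\int_0^\infty \psi(x)\,dx =
\beta\mu_B^{(2)}/[2(1-\beta\mu_B)]$ (which is itself a direct consequence of (2.8) for $s \to 0$; see
also IV.(3.7)).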
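The case $n = 1$ can be checked numerically. The sketch below assumes exponential($\nu$) claims, for which $\psi(u) = (\beta/\nu)e^{-(\nu-\beta)u}$ and $E[e^{-\delta\tau(u)};\tau(u)<\infty] = (1-\gamma_\delta/\nu)e^{-\gamma_\delta u}$, with $\gamma_\delta$ the positive root of $r^2 + (\beta-\nu+\delta)r - \delta\nu = 0$. It compares $\psi_1(u) = \frac{1}{1-\beta\mu_B}\big(\int_0^u\psi(u-y)\psi(y)\,dy + \int_u^\infty\psi(y)\,dy - \psi(u)\int_0^\infty\psi(y)\,dy\big)$ with a finite-difference derivative of the Laplace transform at $\delta = 0$. Parameter values, helper names, the truncation point of the infinite integrals and the quadrature rule are assumptions made for illustration.

```python
import math

beta, nu = 1.0, 2.0   # assumed Poisson rate and exponential claim rate

def psi(u):
    return (beta / nu) * math.exp(-(nu - beta) * u)

def midpoint(f, a, b, steps=20000):
    h = (b - a) / steps
    return sum(f(a + (i + 0.5) * h) for i in range(steps)) * h

def psi_1(u, upper=60.0):
    """E[tau(u); tau(u) < inf] via the n = 1 formula (psi_0 = psi)."""
    conv = midpoint(lambda y: psi(u - y) * psi(y), 0.0, u)
    tail = midpoint(psi, u, upper)        # integral over (u, inf), truncated
    total = midpoint(psi, 0.0, upper)     # integral over (0, inf), truncated
    return (conv + tail - psi(u) * total) / (1.0 - beta / nu)

def laplace_time_to_ruin(delta, u):
    b = beta - nu + delta
    gamma = (-b + math.sqrt(b * b + 4.0 * delta * nu)) / 2.0
    return (1.0 - gamma / nu) * math.exp(-gamma * u)

u, eps = 1.0, 1e-6
finite_diff = -(laplace_time_to_ruin(eps, u) - laplace_time_to_ruin(0.0, u)) / eps
print(psi_1(u), finite_diff, psi_1(u) / psi(u))   # last value: E[tau(u) | tau(u) < inf]
```

For these parameters both approaches should give $\psi_1(1) = \beta(1+\beta u)e^{-(\nu-\beta)u}/(\nu(\nu-\beta)) = e^{-1}$.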


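Let us now use equation (2.4) to derive another representation of the defective
renewal equation for $m(u)$:

Proposition 2.7 The Gerber-Shiu function $m(u)$ in the compound Poisson
model satisfies the defective renewal equation
\[ m(u) = \Big(1-\frac{\delta}{\rho_\delta}\Big)\int_0^u m(u-y)\,g_\rho(y)\,dy + h(u), \tag{2.11} \]
where the (proper) density $g_\rho(y)$ is given by
\[ g_\rho(y) = \frac{\beta}{1-\delta/\rho_\delta}\int_y^\infty e^{-\rho_\delta(x-y)}\,B(dx), \quad y \ge 0, \tag{2.12} \]
and
\[ h(u) = \beta\,e^{\rho_\delta u}\int_u^\infty e^{-\rho_\delta x}\int_x^\infty w(x,y-x)\,B(dy)\,dx. \tag{2.13} \]

Proof. Replacing $\delta+\beta$ in the denominator of (2.4) by $\rho_\delta + \beta\hat B[-\rho_\delta]$ (which
holds because $\kappa(-\rho_\delta) = \delta$), one gets
\[ \hat m[-s] = \frac{\beta\big(\hat\omega[-\rho_\delta]-\hat\omega[-s]\big)}{s-\rho_\delta-\beta\hat B[-\rho_\delta]+\beta\hat B[-s]} = \frac{\beta\,\dfrac{\hat\omega[-\rho_\delta]-\hat\omega[-s]}{s-\rho_\delta}}{1-\beta\,\dfrac{\hat B[-\rho_\delta]-\hat B[-s]}{s-\rho_\delta}}, \tag{2.14} \]
which is of the form (2.11) for
\[ \hat h[-s] = \beta\,\frac{\hat\omega[-\rho_\delta]-\hat\omega[-s]}{s-\rho_\delta} \]
and
\[ \hat g_\rho[-s] = \frac{\beta}{1-\delta/\rho_\delta}\,\frac{\hat B[-\rho_\delta]-\hat B[-s]}{s-\rho_\delta}. \]
Taking the inverse Laplace transform of the latter two quantities then gives the
assertion.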
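The normalisation in the proper density can be made concrete. A minimal sketch, assuming exponential($\nu$) claims with $\hat B[s] = \nu/(\nu-s)$: it finds $\rho_\delta$ from $\kappa(-\rho_\delta) = \delta$, where $\kappa(r) = \beta(\hat B[r]-1) - r$, by bisection and checks that the total mass of $\beta\int_y^\infty e^{-\rho_\delta(x-y)}B(dx)$, namely $\beta(1-\hat B[-\rho_\delta])/\rho_\delta$, equals $1-\delta/\rho_\delta$. The bracket $(\delta, \delta+\beta)$ follows from $s-\beta \le \kappa(-s) \le s$; all names and numerical values are assumptions for illustration.

```python
beta, nu, delta = 1.0, 2.0, 0.1   # assumed parameters

def B_hat(s):
    """Moment generating function of exponential(nu) claims, valid for s < nu."""
    return nu / (nu - s)

def kappa(r):
    return beta * (B_hat(r) - 1.0) - r

def rho_delta(tol=1e-14):
    """Bisection for kappa(-rho) = delta; kappa(-s) is increasing for s >= 0 here."""
    lo, hi = delta, delta + beta
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kappa(-mid) < delta:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

rho = rho_delta()
mass = beta * (1.0 - B_hat(-rho)) / rho   # mass of the unnormalised density
print(rho, mass, 1.0 - delta / rho)
```

Dividing the unnormalised density by $1-\delta/\rho_\delta$ therefore yields a probability density, which is what the geometric compound representation later relies on.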


As a by-product, in view of (1.3) and (1.4) this again leads to
\[ f(x,y|0) = \beta\,e^{-\rho_\delta x}\,b(x+y), \quad x,y > 0, \tag{2.15} \]
which is (2.5).
At the same time, with $w \equiv 1$ it follows from Proposition 2.7 (or directly
from $\hat\omega[-s] = (1-\hat B[-s])/s$ in (2.4)) that
\[ \int_0^\infty e^{-su}\,E[e^{-\delta\tau(u)};\,\tau(u)<\infty]\,du = \frac{\kappa(-s)/s - \delta/\rho_\delta}{\kappa(-s)-\delta}, \]
which already appeared in Corollary V.3.5.


A related consequence of Proposition 2.7 is the asymptotic behavior of m(u) 
for subexponential claim sizes: 


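Theorem 2.8 Assume that $w \equiv 1$ and $B \in \mathcal{S}$ with finite mean. Then, for
$\delta > 0$,
\[ m(u) \sim \frac{\beta}{\delta}\,\bar B(u) \quad \text{as } u \to \infty. \]

Proof. For $w \equiv 1$ we have $\hat\omega[-s] = (1-\hat B[-s])/s$, and hence (again exploiting
$\kappa(-\rho_\delta) = \delta$) the expression $\hat h[-s]$ in the proof of Proposition 2.7 simplifies to
\[ \hat h[-s] = \Big(1-\frac{\delta}{\rho_\delta}\Big)\frac{1-\hat g_\rho[-s]}{s}. \]
But this implies that for all $s > 0$
\[ \hat m[-s] = \frac{(1-\delta/\rho_\delta)\,(1-\hat g_\rho[-s])}{s\,\big(1-(1-\delta/\rho_\delta)\,\hat g_\rho[-s]\big)} = \frac{\delta}{\rho_\delta}\sum_{n=1}^\infty \Big(1-\frac{\delta}{\rho_\delta}\Big)^n\,\frac{1-\hat g_\rho[-s]^n}{s}, \]
where the geometric series converges since both $m(0) = 1-\delta/\rho_\delta < 1$ and
$\hat g_\rho[-s] < 1$ ($g_\rho$ is a probability density). Taking the inverse Laplace transform
now gives a representation of $m(u)$ as the geometric compound tail
\[ m(u) = \frac{\delta}{\rho_\delta}\sum_{n=1}^\infty \Big(1-\frac{\delta}{\rho_\delta}\Big)^n\,\overline{G_\rho^{*n}}(u), \tag{2.16} \]
where $G_\rho$ is the c.d.f. of the density $g_\rho$ in (2.12). Its tail is
\[ \bar G_\rho(y) = \frac{\beta}{1-\delta/\rho_\delta}\int_y^\infty\int_z^\infty e^{-\rho_\delta(x-z)}\,B(dx)\,dz
 = \frac{\beta}{\rho_\delta-\delta}\Big(\bar B(y) - \int_0^\infty e^{-\rho_\delta z}\,B(dz+y)\Big)
 = \frac{\beta}{\rho_\delta-\delta}\,\bar B(y)\,(1+o(1)), \tag{2.17} \]
where the last step follows from $B \in \mathcal{S}$ and Proposition X.1.5. Corollary X.1.10
now implies $G_\rho \in \mathcal{S}$. As in Lemma X.2.2, one obtains from (2.16) by dominated
convergence that
\[ \lim_{u\to\infty}\frac{m(u)}{\bar G_\rho(u)} = \frac{1-\delta/\rho_\delta}{\delta/\rho_\delta}, \]
and the result finally follows from (2.17).

Remark 2.9 Since for $\delta = 0$, $m(u) = \psi(u)$ for $w \equiv 1$, a comparison of the
above result with Theorem X.2.1 shows that for subexponential claim sizes the
introduction of the discount rate $\delta > 0$ moves the asymptotic behavior of $m(u)$
away from the magnitude of the integrated tail $\bar B_0$ to the one of the tail $\bar B$ (for
$\delta \to 0$ one has $1-\delta/\rho_\delta \to \beta\mu_B$ and the density $g_\rho(y)$ in the defective renewal
equation is correspondingly replaced by $\bar B(y)/\mu_B$, cf. IV.(3.2)).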

Since for general penalty functions the representation of m(u) as a compound 
geometric tail is usually not available, one needs slightly different methods to 
establish corresponding asymptotic results. We shall not pursue this further; 
the interested reader is referred to Tang & Wei [834] for details. 
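The geometric compound representation $m(u) = \frac{\delta}{\rho_\delta}\sum_{n\ge 1}(1-\delta/\rho_\delta)^n\,\overline{G_\rho^{*n}}(u)$ for $w \equiv 1$ can be tested in a case where everything is explicit. The sketch below assumes exponential($\nu$) claims (light-tailed, so it illustrates the representation only, not the subexponential asymptotics). Then $g_\rho$ is again the exponential($\nu$) density, $G_\rho^{*n}$ is an Erlang($n,\nu$) distribution, and the series should reproduce $(1-\gamma_\delta/\nu)e^{-\gamma_\delta u}$, where $\gamma_\delta$ and $-\rho_\delta$ are the two roots of $r^2+(\beta-\nu+\delta)r-\delta\nu = 0$. Names, parameter values and the truncation at 400 terms are assumptions.

```python
import math

beta, nu, delta = 1.0, 2.0, 0.1   # assumed parameters
b = beta - nu + delta
disc = math.sqrt(b * b + 4.0 * delta * nu)
gamma = (-b + disc) / 2.0          # positive root of kappa(r) = delta
rho = (b + disc) / 2.0             # -rho is the negative root

def erlang_tail(n, x):
    """P(sum of n i.i.d. exponential(nu) variables > x)."""
    term, total = 1.0, 0.0
    for k in range(n):
        total += term
        term *= nu * x / (k + 1)
    return math.exp(-nu * x) * total

def m_series(u, terms=400):
    """Truncated geometric compound tail with success weight delta / rho."""
    q = 1.0 - delta / rho
    return (delta / rho) * sum(q ** n * erlang_tail(n, u) for n in range(1, terms + 1))

u = 2.0
print(m_series(u), (1.0 - gamma / nu) * math.exp(-gamma * u))
```

The agreement rests on the identity $\gamma_\delta\rho_\delta = \delta\nu$ for exponential claims, which gives $1-\delta/\rho_\delta = 1-\gamma_\delta/\nu$.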


2b Change of measure 


As in Section IV.1, consider the Wald martingale $L_t = \exp\{rS_t - \kappa(r)t\}$ as the
likelihood ratio process. Then we have by a change of measure that
\[ m(u) = E_r\big[e^{-rS_{\tau(u)}+\kappa(r)\tau(u)}\,e^{-\delta\tau(u)}\,w(R_{\tau(u)-},\xi(u));\,\tau(u)<\infty\big]. \]
If the Lundberg coefficient $\gamma_\delta > 0$ exists, then $P_{\gamma_\delta}(\tau(u)<\infty) = 1$ and hence
\[ m(u) = E_{\gamma_\delta}\big[e^{-\gamma_\delta S_{\tau(u)}}\,w(R_{\tau(u)-},\xi(u))\big] = E_{\gamma_\delta}\big[e^{-\gamma_\delta\xi(u)}\,w(R_{\tau(u)-},\xi(u))\big]\,e^{-\gamma_\delta u}. \]
Note that under the new measure $P_{\gamma_\delta}$, not only the event of ruin is certain, but
also the time-dependence of the penalty function has disappeared (or, rather,
hides in the value of $\gamma_\delta$). If the penalty $w$ is bounded, then a Lundberg-type
inequality of the form $m(u) \le \sup_{x,y} w(x,y)\,e^{-\gamma_\delta u}$ immediately follows.

The next result gives the asymptotic behavior for general continuous penalty 
functions: 


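Proposition 2.10 Assume that the penalty function $w$ is continuous. If $\gamma_\delta > 0$
exists, then
\[ \lim_{u\to\infty} m(u)\,e^{\gamma_\delta u} = C_\delta := \frac{\beta\int_0^\infty\int_0^\infty w(z,x)\,B(z+dx)\,\big(e^{\gamma_\delta z}-e^{-\rho_\delta z}\big)\,dz}{\beta\hat B'[\gamma_\delta]-1}. \]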


Proof. Define $\tilde m(u) = m(u)e^{\gamma_\delta u} = E_{\gamma_\delta}[e^{-\gamma_\delta\xi(u)}\,w(R_{\tau(u)-},\xi(u))]$ and denote
by $H(x,y) = P_{\gamma_\delta}[R_{\tau(0)-} \le x,\ \xi(0) \le y]$ the joint distribution of the surplus
prior to ruin and the deficit at ruin under the tilted measure given that the risk
process starts in $u = 0$. Then, just as in (1.3) but now under the new measure,
it immediately follows that
\[ \tilde m(u) = \int_0^u \tilde m(u-y)\,H(\infty,dy) + \int_{y=u}^\infty\int_{x=0}^\infty w(x+u,y-u)\,e^{-\gamma_\delta(y-u)}\,H(dx,dy). \]
This is a (now proper) renewal equation for $\tilde m(u)$ and according to Proposition
A1.1 we need to show that the second summand above is directly Riemann
integrable. Since $w$ is continuous, it is enough to show that there is a directly
Riemann integrable upper bound. Since $\gamma_\delta > 0$ exists, all moments of the claim
size distribution (and consequently of the surplus prior to ruin and the deficit
at ruin) exist and in view of (1.6) it is then enough to show that $1 - H(\infty,y)$
is directly Riemann integrable, but the latter follows from the existence of all
moments of the claim size distribution $B$. Applying now Proposition A1.1, it
just remains to calculate the limiting constant
\[ C_\delta = \frac{\int_{u=0}^\infty\int_{y=u}^\infty\int_{x=0}^\infty w(x+u,y-u)\,e^{-\gamma_\delta(y-u)}\,H(dx,dy)\,du}{\int_0^\infty \big(1-H(\infty,y)\big)\,dy}. \tag{2.18} \]
Following the idea of Remark IV.5.6, the simplest way to identify its value is
from $C_\delta = \lim_{s\to 0} s\,\hat m[-s+\gamma_\delta]$ and expression (2.4). However, we shall here
directly evaluate (2.18): Recall from Theorem IV.2.2 that under the original
measure
\[ P[R_{\tau(0)-} \le x,\ \xi(0) \le y,\ \tau(0)<\infty] = \beta\int_0^x\int_z^{z+y} B(dv)\,dz. \tag{2.19} \]
Under $P_{\gamma_\delta}$, the risk process is again compound Poisson with $\beta_{\gamma_\delta} = \beta\hat B[\gamma_\delta]$ and
$B_{\gamma_\delta}(dx) = \frac{e^{\gamma_\delta x}}{\hat B[\gamma_\delta]}\,B(dx)$, so the safety loading is negative. Hence we need a
further exponential tilting by the factor $-(\rho_\delta+\gamma_\delta)$ to obtain a classical compound
Poisson process $R_t^*$ with positive safety loading, claim distribution $B^*$
and Poisson parameter $\beta^*$, for which we can apply (2.19). This leads to
\[ H(x,y) = E^*\big[e^{(\rho_\delta+\gamma_\delta)\xi^*(0)};\ R^*_{\tau^*(0)-} \le x,\ \xi^*(0) \le y,\ \tau^*(0)<\infty\big] \]
\[ = \beta^*\int_0^x\int_0^y e^{(\rho_\delta+\gamma_\delta)v}\,B^*(z+dv)\,dz = \beta^*\int_0^x e^{-(\rho_\delta+\gamma_\delta)z}\int_z^{z+y} e^{(\rho_\delta+\gamma_\delta)v}\,B^*(dv)\,dz. \]
Since $\beta^*$ and $B^*$ are related to $\beta$ and $B$ through exponential tilting by $-\rho_\delta$, we
finally arrive at
\[ H(x,y) = \beta\int_0^x e^{-(\rho_\delta+\gamma_\delta)z}\int_z^{z+y} e^{\gamma_\delta v}\,B(dv)\,dz. \]
The denominator of (2.18) then is, by changing the order of integration,
\[ \int_0^\infty \big(1-H(\infty,y)\big)\,dy = \beta\int_0^\infty\int_0^\infty\int_{y+z}^\infty e^{\gamma_\delta v}\,B(dv)\,e^{-(\gamma_\delta+\rho_\delta)z}\,dz\,dy \]
\[ = \beta\Big(\frac{\hat B'[\gamma_\delta]}{\gamma_\delta+\rho_\delta} - \frac{\hat B[\gamma_\delta]-\hat B[-\rho_\delta]}{(\gamma_\delta+\rho_\delta)^2}\Big) = \frac{\beta\hat B'[\gamma_\delta]-1}{\gamma_\delta+\rho_\delta}. \]
Similarly, the numerator of (2.18) simplifies to
\[ \beta\int_{u=0}^\infty\int_{y=u}^\infty\int_{x=0}^\infty w(x+u,y-u)\,e^{-\rho_\delta x+\gamma_\delta u}\,B(x+dy)\,dx\,du, \]
which finally leads to the assertion.


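Remark 2.11 Note that for $w \equiv 1$ the constant simplifies to
\[ C_\delta = \frac{\delta}{\beta\hat B'[\gamma_\delta]-1}\Big(\frac{1}{\rho_\delta}+\frac{1}{\gamma_\delta}\Big). \]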
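For $w \equiv 1$, inserting $\int_0^\infty B(z+dx) = \bar B(z)$ into the limit constant of Proposition 2.10 together with $\kappa(\gamma_\delta) = \kappa(-\rho_\delta) = \delta$ gives $C_\delta = \delta\,(1/\rho_\delta + 1/\gamma_\delta)/(\beta\hat B'[\gamma_\delta]-1)$. The following sketch checks this value for exponential($\nu$) claims, where $m(u) = (1-\gamma_\delta/\nu)e^{-\gamma_\delta u}$ exactly and hence $C_\delta = 1-\gamma_\delta/\nu$. Parameter values and variable names are assumptions for illustration.

```python
import math

beta, nu, delta = 1.0, 2.0, 0.1   # assumed parameters
b = beta - nu + delta
disc = math.sqrt(b * b + 4.0 * delta * nu)
gamma = (-b + disc) / 2.0          # Lundberg coefficient gamma_delta
rho = (b + disc) / 2.0             # rho_delta

# Derivative of the claim m.g.f. nu / (nu - r) at r = gamma
B_hat_prime = nu / (nu - gamma) ** 2

C_from_limit_formula = delta * (1.0 / rho + 1.0 / gamma) / (beta * B_hat_prime - 1.0)
C_from_exact_solution = 1.0 - gamma / nu

print(C_from_limit_formula, C_from_exact_solution)
```

Both numbers coincide (about 0.4578 for these parameters), as they should.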


2c Martingales 


It is easy to see that the stochastic process $\{e^{-\delta t - rR_t}\}_{t\ge 0}$ is a martingale w.r.t.
its natural filtration $\mathbb{F}$ if and only if $r = \gamma_\delta > 0$ or $r = -\rho_\delta < 0$. This can be
exploited in various ways.


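Proposition 2.12 If $\gamma_\delta > 0$ exists, then
\[ E\big[e^{-\delta\tau(u)+\gamma_\delta\xi(u)};\,\tau(u)<\infty\big] = e^{-\gamma_\delta u}, \quad \delta \ge 0,\ u \ge 0. \]

Proof. The martingale $\{e^{-\delta t-\gamma_\delta R_t}\}_{t\ge 0}$ is bounded by 1 for $0 \le t < \tau(u)$. Hence
we can apply the optional sampling theorem for the stopping time $\tau(u)$ to obtain
\[ E\big[e^{-\delta\tau(u)-\gamma_\delta R_{\tau(u)}}\big] = e^{-\gamma_\delta u}. \]
Due to $\lim_{t\to\infty} R_t = \infty$ a.s., one has $E[e^{-\delta\tau(u)-\gamma_\delta R_{\tau(u)}};\,\tau(u) = \infty] = 0$ for
$\delta \ge 0$ and the result follows.

Note that for $\delta = 0$, $E[e^{\gamma\xi(u)};\,\tau(u)<\infty] = e^{-\gamma u}$, in line with II.(3.1).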


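Exploiting the martingale for $r = -\rho_\delta$ leads to the following result (which is
Lemma V.3.1, but we state it here again for completeness).

Proposition 2.13 Let $\tau_a^+ = \min\{t \ge 0 : R_t \ge a \mid R_0 = u\}$ for $a > u$. Then
\[ E\big[e^{-\delta\tau_a^+}\big] = e^{-\rho_\delta(a-u)}, \quad \delta \ge 0,\ a > u. \tag{2.21} \]

Proof. For fixed $a > u$, the martingale $\{e^{-\delta t+\rho_\delta R_t}\}_{t\ge 0}$ is bounded by $e^{\rho_\delta a}$ for
$0 \le t \le \tau_a^+$. Hence we can apply the optional sampling theorem for the stopping
time $\tau_a^+$ to obtain $E[e^{-\delta\tau_a^+ + \rho_\delta a}] = e^{\rho_\delta u}$.

Define $\psi_\delta(u) = E[e^{-\delta\tau(u)+\rho_\delta R_{\tau(u)}};\,\tau(u)<\infty]$. Let $T_0 = \min\{t > \tau(u) \mid R_t = 0\}$
be the time of recovery after ruin. Since (2.21) holds for arbitrary $a, u \in \mathbb{R}$,
one has for $a < b$ that $E[e^{-\delta(\tau_b^+-\tau_a^+)} \mid \tau_a^+ < \tau_b^+] = e^{-\rho_\delta(b-a)}$ and consequently
\[ E\big[e^{-\delta(T_0-\tau(u))} \mid \tau(u)<\infty,\ \mathcal{F}_{\tau(u)}\big] = e^{\rho_\delta R_{\tau(u)}}. \]
This leads to
\[ E\big[e^{-\delta T_0};\,\tau(u)<\infty\big] = E\big[e^{-\delta\tau(u)}\,e^{\rho_\delta R_{\tau(u)}};\,\tau(u)<\infty\big] = \psi_\delta(u), \]
which gives $\psi_\delta(u)$ the interpretation as the expected present value of a payment
of 1 made at the time of recovery, if ruin occurs.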


Proposition 2.14 The discounted density of the surplus prior to ruin satisfies
\[ f(x|u) = f(x|0)\,\frac{e^{\rho_\delta u}\,I(x>u) + e^{\rho_\delta x}\,\psi_\delta(u-x)\,I(x<u) - \psi_\delta(u)}{1-\psi_\delta(0)}, \quad x > 0. \]
At $x = u$, $f(x|u)$ has a discontinuity of size $f(u|0)\,e^{\rho_\delta u} = \beta\bar B(u)$.


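Proof. We will use Laplace transforms. Since $\psi_\delta(u)$ is the Gerber-Shiu function
with $w(x,y) = e^{-\rho_\delta y}$ (and correspondingly $\hat\omega[-s] = (\hat B[-\rho_\delta]-\hat B[-s])/(s-\rho_\delta)
= \hat g_\delta[-s]/\beta$), it follows from (2.14) and $\beta\hat\omega[-\rho_\delta] = m(0)$ that
\[ \hat\psi_\delta[-s] = \frac{\psi_\delta(0)-\hat g_\delta[-s]}{(s-\rho_\delta)\,(1-\hat g_\delta[-s])}. \tag{2.22} \]
Similarly to (1.3), by conditioning whether or not ruin occurs at the first time
when the surplus falls below the initial value $u$, one can write down the renewal
equation
\[ f(x,y|u) = \int_0^u f(x,y|u-z)\,g_\delta(z)\,dz + f(x-u,y+u|0)\,I(x>u). \tag{2.23} \]
By (2.15)
\[ f(x-u,y+u|0) = \beta\,e^{-\rho_\delta(x-u)}\,b(x+y) = f(x,y|0)\,e^{\rho_\delta u}. \]
Hence integrating the renewal equation w.r.t. $y$, we have
\[ f(x|u) = \int_0^u f(x|u-z)\,g_\delta(z)\,dz + f(x|0)\,e^{\rho_\delta u}\,I(x>u). \]
The function $\zeta(u)$ defined through $f(x|u) = f(x|0)\,\zeta(u)$ then fulfills the renewal
equation
\[ \zeta(u) = \int_0^u \zeta(u-z)\,g_\delta(z)\,dz + e^{\rho_\delta u}\,I(x>u), \]
so that its Laplace transform is given by
\[ \hat\zeta[-s] = \frac{1}{1-\hat g_\delta[-s]}\cdot\frac{e^{(\rho_\delta-s)x}-1}{\rho_\delta-s}. \]
The statement of the proposition is that this expression is also the Laplace
transform (w.r.t. $u$) of the function
\[ \zeta_2(u) = \frac{e^{\rho_\delta u}\,I(x>u) + e^{\rho_\delta x}\,\psi_\delta(u-x)\,I(x<u) - \psi_\delta(u)}{1-\psi_\delta(0)}. \]
Standard calculations show that
\[ \hat\zeta_2[-s] = \frac{\dfrac{e^{(\rho_\delta-s)x}-1}{\rho_\delta-s} + \big(e^{(\rho_\delta-s)x}-1\big)\,\hat\psi_\delta[-s]}{1-\psi_\delta(0)} = \frac{\big(e^{(\rho_\delta-s)x}-1\big)\Big(\dfrac{1}{\rho_\delta-s}+\hat\psi_\delta[-s]\Big)}{1-\psi_\delta(0)}. \]
Substituting (2.22) into the latter equation gives
\[ \hat\zeta_2[-s] = \frac{e^{(\rho_\delta-s)x}-1}{\rho_\delta-s}\cdot\frac{1}{1-\hat g_\delta[-s]}, \]
which indeed coincides with $\hat\zeta[-s]$.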


Since $\rho_\delta = 0$ for $\delta = 0$, $\psi_0(u)$ is the usual ruin probability $\psi(u)$, and we
obtain the (defective) non-discounted density of the surplus prior to ruin
\[ f_0(x|u) = f_0(x|0)\,\frac{I(x>u) + \psi(u-x)\,I(x<u) - \psi(u)}{1-\psi(0)}, \quad x > 0, \]
which is another way of writing (2.9).


372 CHAPTER XII. GERBER-SHIU FUNCTIONS 


2d Further ruin-related quantities 


Somewhat surprisingly, it just turned out that the time of recovery $T_0$ plays
a crucial role for the surplus prior to ruin. A related natural question is to
consider the maximum severity of ruin prior to recovery, i.e. the r.v. $M(u) =
\sup\{|R_t| : \tau(u) \le t \le T_0\}$. Its distribution function (given that ruin occurs)
turns out to have a strikingly simple form in terms of the ruin probability $\psi(u)$:


Proposition 2.15 For positive safety loading $\eta > 0$,
\[ P\big(M(u) \le z \mid \tau(u)<\infty\big) = \frac{\psi(u)-\psi(u+z)}{\psi(u)\,\big(1-\psi(z)\big)}. \]

Proof. Given ruin occurs, the event $M(u) \le z$ happens if ruin occurs with some
deficit $y \le z$ and if the reserve process does not fall below level $-z$ from there
on before it is positive again. The latter is equivalent to the event that a risk
reserve process starting in $z-y$ attains level $z$ before ruin, which happens with
probability $(1-\psi(z-y))/(1-\psi(z))$. With $f_0(y|u)$ the defective density of the
deficit at ruin from (2.10), this gives
\[ P\big(M(u) \le z \mid \tau(u)<\infty\big) = \int_0^z \frac{f_0(y|u)}{\psi(u)}\,\frac{1-\psi(z-y)}{1-\psi(z)}\,dy. \]
On the other hand,
\[ \psi(u+z) = \int_z^\infty f_0(y|u)\,dy + \int_0^z f_0(y|u)\,\psi(z-y)\,dy, \tag{2.24} \]
because for a risk process starting at level $u+z$, ruin can only occur if the
reserve falls below level $z$ and the first integral gives the probability that ruin
occurs directly then, whereas the second integral gives the probability that ruin
occurs later. Equation (2.24) can be rewritten as
\[ \psi(u+z) = \psi(u) - \int_0^z f_0(y|u)\,\big(1-\psi(z-y)\big)\,dy, \]
from which the result follows.
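To see the formula $P(M(u)\le z\mid\tau(u)<\infty) = (\psi(u)-\psi(u+z))/(\psi(u)(1-\psi(z)))$ at work, the sketch below assumes exponential($\nu$) claims. In that case $\psi(u) = (\beta/\nu)e^{-(\nu-\beta)u}$, the deficit at ruin given ruin is exponential($\nu$) whatever the initial capital, and the distribution of the maximum severity does not depend on $u$. The second function evaluates the integral $\int_0^z \nu e^{-\nu y}\,\frac{1-\psi(z-y)}{1-\psi(z)}\,dy$ from the proof by a midpoint rule. Names and parameter values are assumptions.

```python
import math

beta, nu = 1.0, 2.0        # assumed Poisson rate and exponential claim rate
theta = nu - beta

def psi(u):
    return (beta / nu) * math.exp(-theta * u)

def severity_cdf(z, u):
    """Closed form for P(M(u) <= z | tau(u) < inf)."""
    return (psi(u) - psi(u + z)) / (psi(u) * (1.0 - psi(z)))

def severity_cdf_by_integration(z, steps=20000):
    """Integral representation from the proof; valid as written for exponential claims only."""
    h = z / steps
    total = 0.0
    for i in range(steps):
        y = (i + 0.5) * h
        deficit_given_ruin = nu * math.exp(-nu * y)   # conditional deficit density
        total += deficit_given_ruin * (1.0 - psi(z - y)) * h
    return total / (1.0 - psi(z))

for z in (0.5, 1.0, 3.0):
    print(z, severity_cdf(z, 1.0), severity_cdf_by_integration(z))
```

For exponential claims both expressions reduce to $(1-e^{-(\nu-\beta)z})/(1-(\beta/\nu)e^{-(\nu-\beta)z})$.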


If the reserve process recovers after ruin, it may again first become negative 
before it reaches the previous maximum of the process, possibly leading to a 
larger ruin severity before reaching again this running maximum. The maximum 
severity of the ruin excursion out of the running maximum can be studied by very 
simple means thanks to the duality with queueing models, leading to another 
simple formula in terms of the survival probability $\phi(u) = 1-\psi(u)$:

Proposition 2.16 Define $M_r(u)$ as the (absolute value of the) maximum severity
during the excursion out of the running maximum of $R_t$ that causes ruin.
Then, for positive safety loading,
\[ P\big(M_r(u) > z \mid \tau(u)<\infty\big) = \frac{\phi(u)}{\psi(u)}\int_u^\infty \frac{\phi'(w+z)}{\phi(w+z)\,\phi(w)}\,dw. \]

Proof. Denote by $F_k(u,z)$ the probability that for a reserve process starting in $u$,
ruin occurs at the $k$th excursion out of the running maximum and the maximum
severity is below level $-z$. Recall from Theorem III.2.3 the close connection
between the maximum workload $V_{\max}$ of an M/G/1 queue and the survival
probability $\phi(u)$ of the compound Poisson risk process. If $G(u) = P(V_{\max} \le u)$,
then
\[ F_k(u,z) = \int_0^\infty \frac{\beta^k t^{k-1}}{(k-1)!}\,e^{-\beta t}\,\Big(\frac{1}{t}\int_0^t G(u+v)\,dv\Big)^{k-1}\,\bar G(u+t+z)\,dt, \]
because each excursion out of the running maximum occurs at an exponential($\beta$)
distributed time and whenever the excursion does not lead to ruin it can be excised
from the process. Accordingly, the $k$th excursion occurs after an Erlang($k$)
distributed time and the previous ones are uniformly distributed over this interval
and must not lead to ruin. Now the assertion follows by noting that
$P(M_r(u) > z,\ \tau(u)<\infty) = \sum_{k=1}^\infty F_k(u,z)$, some simple algebra and application
of Theorem III.2.3.


Notes and references There are several ways to derive the results of this sec- 
tion. The seminal paper of Gerber & Shiu [409] is a rich source of calculations in this 
context and much of the material presented in this section can be found there. The 
transparency of Laplace transforms in the analysis of Gerber-Shiu functions is appar- 
ent, see also Dufresne & Gerber [333], Gerber & Shiu [408] and Dickson [305, 308]. 
Parts of the exposition of Section 2a follow from Albrecher & Boxma [19]; see also 
Willmot & Lin [892] and Schmidli [773]. In Albrecher, Gerber & Yang [23] one can 
find a transparent approach to some of the derived formulas by just using rational 
functions. Starting with the defective renewal equation for m, Lin & Willmot [596] 
derive some of the above and further results via compound geometric tails. Compu- 
tational aspects of the calculation of ruin time moments are discussed in Drekic et 
al. [329, 330], see also Dermitzakis et al. [296]. The duration of negative surplus 
$T_0 - \tau(u)$ was studied by other techniques in Dickson & Egidio dos Reis [311]. For the
time and area spent at negative surplus levels up to a fixed time T, see Loisel [607]. 
Borovkov, Palmowski & Boxma [190] give a detailed related analysis of such quantities 
in a queueing context. Pitts & Politis [708] use a functional approach to approximate 
the Gerber-Shiu function with the one from a ‘near’ claim distribution for which more 
explicit results exist. An algorithmic procedure to obtain moments of the ruin time
for discrete claim sizes in terms of generalized Appell polynomials was developed in
Picard & Lefèvre [702]. 

The result of Theorem 2.8 is from Siaulys & Asanaviciute [803]. Generalizing a 
number of earlier results, Landriault & Willmot [573] use the Lagrange implicit func- 
tion theorem to determine an explicit expression for the trivariate distribution of the 
time to ruin, the deficit at ruin and the surplus prior to ruin. This expression contains 
an infinite series of the integral of convolutions of the claim size density. Extending an 
idea of Frey & Schmidt [373], Usabel [860] develops a recursive computational tech- 
nique to approximate the above trivariate distribution using its Taylor expansion in 
terms of the Poisson parameter $\beta$ around $\beta = 0$. Tail bounds for this distribution
obtained from the integral equation can be found in Psarrakos & Politis [720]. 

The Gerber-Shiu function in a compound Poisson model with interest is considered 
in Cai & Dickson [214], see also Yang & Zhang [901] and Wu, Wang & Zhang [897]. Cai 
[213] and Yuen & Wang [911] deal with stochastic interest, whereas Badescu, Drekic 
& Landriault [118] study these ruin-related quantities with a multi-step premium rule 
under a MAP arrival process. For absolute ruin and the inclusion of tax payments see 
Ming, Wang & Xao [646]. Albrecher, Hartinger & Tichy [27] study the Gerber-Shiu 
function under a time-dependent threshold model for the premium income. 

Using the renewal measure of the defective renewal sequence of the zero points of 
$R_t$, calculations involving the maximum of the surplus process up to ruin, the last time
the surplus process passes zero before going to infinity ultimately and the minimum of
the surplus process up to that time are provided in Wu, Wang & Wei [896]. Proposition
2.15 is from Picard [700], where also the quantity $\int_{\tau(u)}^{T_0} |R_t|\,dt$ (i.e. the area below zero
until the time of recovery) is studied. For Proposition 2.16 and further related formulas 
see Albrecher, Borst, Boxma & Resing [17]. Baigger [124] studies general criteria under 
which ruin occurs only finitely often. 

The Kella-Whitt martingale and the martingale introduced in [84] are used in 
Frostig [376] to derive results about the time of ruin in the presence of a reflecting 
barrier. Extensions are of course possible in many directions, Cheung et al. [242] 
include for instance information on the surplus after the second-last claim before ruin. 

Note that all techniques presented for the renewal model in the next section are 
by definition directly applicable for the compound Poisson model as well. 


3 The renewal model 


In the following we consider the zero-delayed renewal model with interarrival
time distribution $A(t)$ and density $a(t)$. If $T_1$ is the epoch of the first claim, the
standard renewal argument gives $m(u) = E\big(e^{-\delta T_1}\,m(u+T_1-U_1)\big)$, i.e. $m(u)$
is given by
\[ m(u) = \int_0^\infty e^{-\delta t}\,a(t)\Big(\int_0^{u+t} m(u+t-y)\,B(dy) + \int_{u+t}^\infty w(u+t,\,y-u-t)\,B(dy)\Big)\,dt. \tag{3.1} \]

3a Change of measure


Recall from Section VI.3a the imbedded random walk structure of the renewal
model, i.e. if we only consider the discrete time points at which a claim just
occurred, the resulting discrete time process is a random walk and in particular
Markovian. However, since in this chapter we want to keep information on the
time to ruin and the surplus prior to ruin as well (both of which are lost in the
imbedded random walk view), this type of markovization of the process is
too crude for the present purpose. An alternative and elegant way to markovize
the process is to consider the random variable $V_t = T_{N_t+1} - t$ as an additional
state variable, which is the time remaining until the next claim.³ Define $\kappa(r)$
as the solution of
\[ \hat B[r]\,\hat A[-r-\kappa(r)] = 1, \tag{3.2} \]
for every $r$ with $\hat B[r] < \infty$. It is easy to show by properties of moment-generating
functions that for $r > 0$ this solution $\kappa(r)$ exists, is unique, and that $\kappa(r)$ is a
strictly convex function on the set where it exists. Also, $\kappa(0) = 0$ and $\kappa'(0) < 0$
under the net profit condition. With some further effort, one can then show
that
\[ L_t = \hat B[r]\,e^{-(r+\kappa(r))V_t}\,e^{rS_t-\kappa(r)t} \]
is a martingale with respect to the filtration generated by $\{(R_t,V_t)\}$ (see e.g.
[746, Th.11.5.2]). $L_t$ can now be used as a likelihood ratio process. Under
the measure $P_r[\cdot] = E[L_t;\,\cdot\,]$, the risk process $R_t$ remains a Sparre Andersen
risk process with claim distribution $B_r(dy) = \hat A[-r-\kappa(r)]\,e^{ry}\,B(dy)$ and the
interclaim time distribution changed to $A_r(dt) = \hat B[r]\,e^{-(r+\kappa(r))t}\,A(dt)$. If $r >
\operatorname{argmin}\kappa(r)$, then the drift under the new measure is negative ($-\kappa'(r) < 0$) and
consequently $P_r(\tau(u)<\infty) = 1$.

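Under the measure $P_r$, the Gerber-Shiu function $m(u)$ can be expressed as
\[ m(u) = \hat A[-r-\kappa(r)]\;E_r\big[e^{(r+\kappa(r))V_{\tau(u)} - rS_{\tau(u)} + (\kappa(r)-\delta)\tau(u)}\,w(R_{\tau(u)-},\xi(u));\,\tau(u)<\infty\big]. \]
Since $V_{\tau(u)}$ is the time to the next claim after ruin and hence independent of
$\mathcal{F}_{\tau(u)-}$ and $R_{\tau(u)}$, this identity simplifies to
\[ m(u) = E_r\big[e^{-r\xi(u)+(\kappa(r)-\delta)\tau(u)}\,w(R_{\tau(u)-},\xi(u));\,\tau(u)<\infty\big]\,e^{-ru}. \]
As in the compound Poisson case, the time dependence disappears if $\kappa(r) = \delta$.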


³ This is called forward markovization and leads to some subtleties concerning the
interpretation of the appropriate filtration. Alternatively, one could also use the time since the
last claim (backward markovization) with a more intuitive appropriate filtration, but then the
resulting equations are usually more cumbersome, see the References at the end of the section.

Proposition 3.1 Assume that the equation $\kappa(r) - \delta = 0$ with $\kappa(r)$ defined in
(3.2) has a positive solution $\gamma_\delta > \operatorname{argmin}\kappa(r)$. Then
\[ m(u) = E_{\gamma_\delta}\big[e^{-\gamma_\delta\xi(u)}\,w(R_{\tau(u)-},\xi(u))\big]\,e^{-\gamma_\delta u} \tag{3.3} \]
and for a continuous penalty function $w$
\[ \lim_{u\to\infty} e^{\gamma_\delta u}\,m(u) = C_\delta \]
for some constant $C_\delta > 0$.

Proof. Expression (3.3) follows from $r = \gamma_\delta$ and $P_{\gamma_\delta}(\tau(u)<\infty) = 1$. Now the
same procedure as in the proof of Proposition 2.10 gives a renewal equation for
$e^{\gamma_\delta u}m(u)$ and establishes the asymptotic result.


The constant $C_\delta$ is now more difficult to evaluate. We will give its form for
a large class of interclaim time distributions $A$ in Corollary 3.9.

Formula (3.3) can be helpful in a number of situations. We give one partic- 
ular example: 


Corollary 3.2 In the renewal model, for exponential($\nu$) claim sizes the Laplace
transform of the time to ruin is given by
\[ E\big[e^{-\delta\tau(u)};\,\tau(u)<\infty\big] = \Big(1-\frac{\gamma_\delta}{\nu}\Big)\,e^{-\gamma_\delta u}. \]

Proof. In this case $\gamma_\delta > 0$ clearly exists and is the solution of $\hat A[-\gamma_\delta-\delta] =
1-\gamma_\delta/\nu$. The lack-of-memory property implies $P(\xi(u) > x \mid \tau(u)<\infty) = e^{-\nu x}$,
and correspondingly $P_{\gamma_\delta}(\xi(u) > x) = e^{-(\nu-\gamma_\delta)x}$. The result then follows from
(3.3) with $w \equiv 1$.
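A minimal numerical sketch of this corollary, assuming Erlang(2, $\lambda$) interclaim times (so that $\hat A[s] = (\lambda/(\lambda-s))^2$) and exponential($\nu$) claims with premium rate 1: $\gamma_\delta$ is found from $\hat A[-\gamma_\delta-\delta] = 1-\gamma_\delta/\nu$ by bisection on $(0,\nu)$, where the difference of the two sides changes sign exactly once, and the Laplace transform $(1-\gamma_\delta/\nu)e^{-\gamma_\delta u}$ is then evaluated. The choice of interclaim distribution, the parameter values and the function names are assumptions for illustration.

```python
import math

lam, nu, delta = 2.0, 2.0, 0.1   # assumed Erlang(2, lam) rate, claim rate, discount rate

def A_hat(s):
    """Moment generating function E exp(sT) of an Erlang(2, lam) time, s < lam."""
    return (lam / (lam - s)) ** 2

def lundberg_gamma(tol=1e-14):
    """Root of A_hat(-g - delta) = 1 - g / nu in (0, nu), by bisection."""
    f = lambda g: A_hat(-g - delta) - (1.0 - g / nu)
    lo, hi = 0.0, nu             # f(0) < 0 < f(nu) for delta > 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

gamma = lundberg_gamma()

def laplace_ruin_time(u):
    """E[exp(-delta tau(u)); tau(u) < inf] for exponential claims."""
    return (1.0 - gamma / nu) * math.exp(-gamma * u)

print(gamma, laplace_ruin_time(0.0), laplace_ruin_time(2.0))
```

Replacing `A_hat` by another interclaim transform is enough to treat a different renewal model, provided the bracket for the bisection is adapted.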


Remark 3.3 Note that Corollary 3.2 extends Theorem VI.2.2 with a quite 
different proof. 


3b A modified random walk 


Another way to remove the discounting is to interpret $a_\delta(t) = e^{-\delta t}a(t)$ in (3.1) as
a new (now defective) interclaim time density for a non-discounted risk process.
This leads to a modified imbedded random walk $S_{\delta,k} = \sum_{i=1}^k (U_i - T_{\delta,i})$, $k \ge 1$,
where $T_{\delta,i}$ has defective density $a_\delta(t)$ and a point mass of size $1 - \int_0^\infty a_\delta(t)\,dt =
1 - \hat A[-\delta]$ at infinity. Consequently, $\sup_k S_{\delta,k}$ is finite with probability 1. By
definition, the ruin probability of this modified random walk is the Laplace
transform of the ruin time of the original risk process $R_t$,
\[ E\big[e^{-\delta\tau(u)};\,\tau(u)<\infty\big] = P\big(\sup_k S_{\delta,k} > u\big), \]
and the discounted distribution of the deficit at ruin of $R_t$ is
\[ E\big[e^{-\delta\tau(u)};\ \xi(u) \le y,\ \tau(u)<\infty\big] = P\big(N_{\delta,u}<\infty,\ S_{\delta,N_{\delta,u}} \le u+y\big), \]
where $N_{\delta,u} = \inf\{k : S_{\delta,k} > u\}$. Hence for these penalty functions the calculations
are reduced to random walk techniques with defective increment distribution
(for which e.g. the Wiener-Hopf factorization can still be done in the same
way). If the claim sizes are phase-type, this leads to the following generalization
of Corollary 3.2 and also of Theorem IX.4.4:


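Theorem 3.4 Consider the renewal model with arbitrary interarrival distribution
$A$ and phase-type claim size distribution $B$ with representation $(\boldsymbol\alpha, \boldsymbol T)$.
Denote with $\boldsymbol\alpha_{\delta+}$ the minimal non-negative solution of
\[ \boldsymbol\alpha_{\delta+} = \boldsymbol\alpha\int_0^\infty e^{(\boldsymbol T+\boldsymbol t\boldsymbol\alpha_{\delta+})s}\,a_\delta(s)\,ds. \]
Then the Laplace transform of the ruin time is
\[ E\big[e^{-\delta\tau(u)};\,\tau(u)<\infty\big] = \boldsymbol\alpha_{\delta+}\,e^{(\boldsymbol T+\boldsymbol t\boldsymbol\alpha_{\delta+})u}\,\boldsymbol e, \]
and the discounted distribution of the deficit at ruin is given by
\[ E\big[e^{-\delta\tau(u)};\ \xi(u) > y,\ \tau(u)<\infty\big] = \boldsymbol\alpha_{\delta+}\,e^{(\boldsymbol T+\boldsymbol t\boldsymbol\alpha_{\delta+})u}\,e^{\boldsymbol T y}\,\boldsymbol e. \]

Proof. The proof is a straightforward extension of the one of Theorem IX.4.4;
see also Ren [735].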
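In the simplest phase-type case the fixed-point equation is scalar. The sketch below assumes exponential($\nu$) claims, i.e. one phase with $\boldsymbol\alpha = 1$, $\boldsymbol T = -\nu$ and exit rate $\boldsymbol t = \nu$, together with Erlang(2, $\lambda$) interclaim times. The equation for $\alpha_{\delta+}$ then reads $x = \hat A[-\delta-\nu(1-x)] = (\lambda/(\lambda+\delta+\nu(1-x)))^2$, and iterating from $x = 0$ approaches the minimal non-negative solution monotonically. The Laplace transform of the ruin time becomes $\alpha_{\delta+}e^{-\nu(1-\alpha_{\delta+})u}$, which should agree with Corollary 3.2 for $\gamma_\delta = \nu(1-\alpha_{\delta+})$. Parameter values, names and the iteration count are assumptions.

```python
import math

lam, nu, delta = 2.0, 2.0, 0.1   # assumed Erlang(2, lam) rate, claim rate, discount rate

def fixed_point_map(x):
    """int_0^inf exp((T + t x) s) a_delta(s) ds for T = -nu, t = nu, a = Erlang(2, lam)."""
    return (lam / (lam + delta + nu * (1.0 - x))) ** 2

def alpha_plus(iterations=200):
    x = 0.0                       # start below the minimal solution
    for _ in range(iterations):
        x = fixed_point_map(x)
    return x

a_plus = alpha_plus()
gamma = nu * (1.0 - a_plus)       # implied Lundberg coefficient

def laplace_ruin_time(u):
    return a_plus * math.exp((-nu + nu * a_plus) * u)

print(a_plus, gamma, laplace_ruin_time(2.0))
```

For genuinely matrix-valued representations the same iteration applies with a matrix exponential inside the integral; that step is not sketched here.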


3c Integro-differential equations 


If the interarrival time density $a(t)$ has rational Laplace transform, the integral
equation (3.1) can be transformed into an integro-differential equation (IDE).
For that purpose assume that $a(t)$ satisfies an $n$th order linear differential equation
with constant coefficients, written in operator notation as
\[ p_A\Big(\frac{d}{dt}\Big)\,a(t) = 0, \tag{3.4} \]
with the polynomial
\[ p_A(x) = x^n + c_{n-1}x^{n-1} + \cdots + c_0, \quad c_j \in \mathbb{R},\ c_0 \ne 0. \]
The first initial condition is determined by the fact that $a(t)$ is a density. For
ease of exposition, assume that the remaining $n-1$ initial conditions of this
ordinary differential equation (ODE) are homogeneous,⁴ i.e.
\[ a^{(k)}(0) = 0 \quad (k = 0,\ldots,n-2). \tag{3.5} \]
Integrating (3.4) from 0 to $\infty$ and using $\int_0^\infty a(t)\,dt = 1$, with (3.5) the first
initial condition then is
\[ a^{(n-1)}(0) = c_0. \tag{3.6} \]
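As a concrete instance of (3.4) to (3.6), consider the Erlang(2, $\beta$) density $a(t) = \beta^2 t e^{-\beta t}$ with $p_A(x) = (x+\beta)^2 = x^2 + 2\beta x + \beta^2$, so that $n = 2$, $c_1 = 2\beta$ and $c_0 = \beta^2$. The sketch below evaluates the hand-computed derivatives at a few points and checks the ODE as well as the initial conditions $a(0) = 0$ and $a'(0) = c_0$. The value of $\beta$ and the function names are assumptions for illustration.

```python
import math

beta = 2.0                         # assumed Erlang rate
c1, c0 = 2.0 * beta, beta ** 2     # p_A(x) = x^2 + c1 x + c0 = (x + beta)^2

def a(t):
    return beta ** 2 * t * math.exp(-beta * t)

def da(t):
    return beta ** 2 * (1.0 - beta * t) * math.exp(-beta * t)

def d2a(t):
    return beta ** 2 * (beta ** 2 * t - 2.0 * beta) * math.exp(-beta * t)

# Residual of p_A(d/dt) a(t) = a'' + c1 a' + c0 a at several time points
residuals = [d2a(t) + c1 * da(t) + c0 * a(t) for t in (0.0, 0.3, 1.0, 2.5)]
print(residuals, a(0.0), da(0.0), c0)
```

The residuals vanish up to rounding, and the printed initial values reproduce the homogeneous condition and $a^{(n-1)}(0) = c_0$.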


Proposition 3.5 Let the interarrival density $a(t)$ fulfill (3.4) with initial conditions
(3.5) and (3.6) and let $m(u)$ be sufficiently smooth. Then $m(u)$ is the
solution of the IDE
\[ p_A\Big(\delta-\frac{d}{du}\Big)\,m(u) = c_0\int_0^u m(u-y)\,B(dy) + c_0\,\omega(u), \tag{3.7} \]
with boundary condition
\[ \lim_{u\to\infty} m(u) = 0. \tag{3.8} \]

Proof. Rewrite (3.1) as
\[ m(u) = \int_0^\infty e^{-\delta t}\,a(t)\,g(u+t)\,dt \]
with
\[ g(u) = \int_0^u m(u-y)\,B(dy) + \int_u^\infty w(u,y-u)\,B(dy). \]
By dominated convergence and partial integration one has
\[ \Big(\delta-\frac{d}{du}\Big)m(u) = \int_0^\infty \Big(\delta-\frac{d}{du}\Big)\big(e^{-\delta t}a(t)\,g(u+t)\big)\,dt = -\int_0^\infty a(t)\,\frac{d}{dt}\big(e^{-\delta t}g(u+t)\big)\,dt = g(u)\,a(0) + \int_0^\infty e^{-\delta t}\,g(u+t)\,a'(t)\,dt. \]
Analogously we have under (3.5)
\[ \Big(\delta-\frac{d}{du}\Big)^k m(u) = g(u)\,a^{(k-1)}(0) + \int_0^\infty e^{-\delta t}\,g(u+t)\,a^{(k)}(t)\,dt, \quad k = 1,\ldots,n, \]
and combining these identities in such a way that (3.4) appears inside the integral
on the right-hand side, we obtain (3.7) in view of the initial conditions.

⁴ Inhomogeneous initial conditions can be dealt with analogously, one just gets additional
terms in the calculations. Recall from Chapter I that any $a(t)$ with rational Laplace transform
can be represented as the solution of (3.4) and general initial conditions. But already the
subclass with homogeneous conditions (3.5) is relevant. For instance, any density which is a
convolution of $n$ exponential densities with parameters $\beta_i$ can be expressed through (3.4) and
(3.5) with $p_A(x) = \prod_{i=1}^n (x+\beta_i)$.


Ordinary differential equations 


Assume now that the claim size density $b(y)$ is also the solution of an ODE of
the form
\[ p_B\Big(\frac{d}{dy}\Big)\,b(y) = 0 \tag{3.9} \]
with the polynomial
\[ p_B(x) = x^\ell + d_{\ell-1}x^{\ell-1} + \cdots + d_0, \quad d_j \in \mathbb{R},\ d_0 \ne 0 \]
and some initial conditions $b^{(k)}(0)$ ($k = 0,\ldots,\ell-1$) (where one initial condition
is again determined by the fact that $b(x)$ is a density). Then the IDE for $m(u)$
can further be reduced to a linear ODE:


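Proposition 3.6 Assume that the claim size density $b(y)$ satisfies (3.9). Then,
under the assumptions of Proposition 3.5, $m(u)$ satisfies the ODE
\[ \Big(p_B\Big(\frac{d}{du}\Big)\,p_A\Big(\delta-\frac{d}{du}\Big) - p_I\Big(\frac{d}{du}\Big)\Big)\,m(u) = c_0\,p_B\Big(\frac{d}{du}\Big)\,\omega(u) \tag{3.10} \]
with the polynomial
\[ p_I(x) = c_0\sum_{j=0}^{\ell-1}\ \sum_{k=j+1}^{\ell} d_k\,b^{(k-j-1)}(0)\,x^j \]
and $d_\ell = 1$. One boundary condition is (3.8) and $\ell$ more boundary conditions
need to be specified.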


Proof. For $k = 1,\ldots,\ell$
\[ \frac{d^k}{du^k}\int_0^u m(y)\,b(u-y)\,dy = \sum_{j=0}^{k-1} m^{(j)}(u)\,b^{(k-1-j)}(0) + \int_0^u m(y)\,\frac{d^k}{du^k}\,b(u-y)\,dy. \]
The appropriate linear combination of derivatives of (3.7) according to (3.9)
cancels the integral term on the r.h.s. of (3.7) and leaves instead
\[ c_0\sum_{k=1}^{\ell} d_k\sum_{j=0}^{k-1} m^{(j)}(u)\,b^{(k-j-1)}(0) = c_0\sum_{j=0}^{\ell-1}\Big(\sum_{k=j+1}^{\ell} d_k\,b^{(k-j-1)}(0)\Big)\,m^{(j)}(u) \]
with $d_\ell = 1$.


It immediately follows from the representation (3.10) that the Lundberg
fundamental equation for this model is given by the polynomial equation
\[ p_B(s)\,p_A(\delta-s) - p_I(s) = 0. \tag{3.11} \]
From the definition of $p_B$ and $p_I$, it becomes clear that $p_I(s)/p_B(s) = c_0\hat B[-s]$,
so that (3.11) can also be written as
\[ p_A(\delta-s) - c_0\,\hat B[-s] = 0. \tag{3.12} \]

Lemma 3.7 For $\delta > 0$, the Lundberg fundamental equation (3.12) has exactly
$n$ roots with positive and $\ell$ roots with negative real part.

Proof. From (3.11) it is clear that the Lundberg fundamental equation is a
polynomial of degree $n+\ell$, so that it has $n+\ell$ complex roots. The location of
the roots then follows by an application of Rouché's theorem.


Example 3.8 Assume that the initial conditions of (3.9) are given by
\[ b^{(k)}(0) = 0 \quad (k = 0,\ldots,\ell-2). \tag{3.13} \]
$b(y)$ is a density, so it automatically follows that $b^{(\ell-1)}(0) = d_0$ and the inhomogeneity
polynomial in (3.10) simplifies to $p_I(x) = c_0 d_0$. A particular case is when
the interclaim time is Erlang($n,\beta$) distributed and the claim size is Erlang($\ell,\nu$)
distributed, in which case we have $p_A(x) = (x+\beta)^n$ with (3.5) and $c_0 = \beta^n$
as well as $p_B(x) = (x+\nu)^\ell$ with (3.13) and $d_0 = \nu^\ell$. Consequently, the ODE
(3.10) then simplifies to
\[ \Big(\Big(\frac{d}{du}+\nu\Big)^\ell\Big(\beta+\delta-\frac{d}{du}\Big)^n - \beta^n\nu^\ell\Big)\,m(u) = \beta^n\Big(\frac{d}{du}+\nu\Big)^\ell\,\omega(u), \]
which is a popular choice in the literature.


In order to use the ODE (3.10) for concrete calculations, one needs to determine
the remaining $\ell-1$ boundary conditions, usually using the negative
solutions of the Lundberg fundamental equation, which can be a quite cumbersome
task in general, but is possible in particular cases (for instance by the
method of so-called integrating factors, see the Notes). An alternative is to
determine the fundamental solution of (3.10) and substitute that back in the
original IDE (3.7), or to use Laplace transforms.

Laplace transforms


As in Section 2, the convolution term in the IDE (3.7) suggests Laplace transforms
as a natural tool in this context. Since $\int_0^\infty e^{-su}\,m^{(k)}(u)\,du = s^k\hat m[-s] -
s^{k-1}m(0) - s^{k-2}m'(0) - \cdots - m^{(k-1)}(0)$, one gets
\[ p_A(\delta-s)\,\hat m[-s] + q(s) = c_0\,\hat m[-s]\,\hat B[-s] + c_0\,\hat\omega[-s] \]
for some polynomial $q(s)$ of degree $n-1$ and subsequently
\[ \hat m[-s] = \frac{c_0\,\hat\omega[-s] - q(s)}{p_A(\delta-s) - c_0\,\hat B[-s]}. \tag{3.14} \]
Note that the denominator in this expression is again the Lundberg fundamental
equation, which from Lemma 3.7 for $\delta > 0$ is known to have $n$ roots $s =
\rho_1,\ldots,\rho_n$ with positive real part. Since the Laplace transform is an analytic
function for $\Re(s) > 0$, these $n$ roots must also be zeros of the numerator, which
determines the $n$ coefficients of $q(s)$. Assuming for simplicity that the roots are
distinct, the usual Lagrange interpolation formula gives
\[ q(s) = c_0\sum_{j=1}^n \hat\omega[-\rho_j]\prod_{k=1,\,k\ne j}^n \frac{s-\rho_k}{\rho_j-\rho_k}. \tag{3.15} \]
Using $m(0) = \lim_{s\to\infty} s\,\hat m[-s]$, it then follows from (3.14) that
\[ m(0) = \frac{-c_0\sum_{j=1}^n \hat\omega[-\rho_j]\prod_{k=1,\,k\ne j}^n \frac{1}{\rho_j-\rho_k}}{(-1)^n} = c_0\sum_{j=1}^n \hat\omega[-\rho_j]\prod_{k=1,\,k\ne j}^n \frac{1}{\rho_k-\rho_j}. \tag{3.16} \]
Since $\hat\omega[-s] = \int_0^\infty\int_0^\infty e^{-sx}\,w(x,y)\,b(x+y)\,dx\,dy$, a comparison with (1.5) now
yields the formula
\[ f(x,y|0) = c_0\,b(x+y)\sum_{j=1}^n e^{-\rho_j x}\prod_{k=1,\,k\ne j}^n \frac{1}{\rho_k-\rho_j} \tag{3.17} \]
for the discounted joint density of surplus prior to and at ruin (given zero initial
capital), expressed in terms of the zeros of the Lundberg fundamental equation.
For the compound Poisson case ($n = 1$ and $c_0 = \beta$), (3.17) simplifies to (2.15).

Since (2.23) holds in the present renewal setting as well, one gets the representation
\[ \int_0^u f(x,y|u-z)\,g_\delta(z)\,dz + c_0\,b(x+y)\,I(x>u)\sum_{j=1}^n e^{-\rho_j(x-u)}\prod_{k=1,\,k\ne j}^n \frac{1}{\rho_k-\rho_j} \]
of $f(x,y|u)$. Integration with respect to $y$ gives the expression
\[ \int_0^u f(x|u-z)\,g_\delta(z)\,dz + c_0\,\bar B(x)\,I(x>u)\sum_{j=1}^n e^{-\rho_j(x-u)}\prod_{k=1,\,k\ne j}^n \frac{1}{\rho_k-\rho_j} \]
for $f(x|u)$. Correspondingly, as a function of $x$, at $x = u$ the discounted density
of the surplus prior to ruin has a discontinuity of size
\[ c_0\,\bar B(u)\sum_{j=1}^n\ \prod_{k=1,\,k\ne j}^n \frac{1}{\rho_k-\rho_j}. \]
But for $n \ge 2$ this sum equals zero, so that the discontinuity disappears!⁵
The explicit form of the Laplace transform also allows to sharpen Proposition 3.1:


Corollary 3.9 If the interarrival density a(t) fulfills (3.4) with initial condi- 
tions (3.5), then under the assumptions of Proposition 3.1 with a simple positive 
zero ys > 0 of K(r) = 6 and distinct roots —pi,...,—Pn with negative real part, 
one has 


~ ~ —%5 — Pk 
Blye] — jaa Olo] Teas ees ees 
lim e?"“m(u) = = J 
u00 —p',(6 + ¥5)/co + B’ly¥6] 


Proof. In view of Proposition 3.1, it suffices to determine the constant Cs. With 
the formula Cs = lim,_.9 sm[—s + ys], we obtain the result through an applica- 
tion of L’H6pital’s rule in (3.14), using (3.15) and the fact that the solution ys 
of the Lundberg fundamental equation has multiplicity 1. 


Finally, we illustrate how the simple formula (2.6) for the Laplace transform 
of the time to ruin with zero initial capital can be generalized to certain renewal 
models: 


Example 3.10 Assume that the interarrival time is a generalized Erlang r.v.
(that is, an independent sum of not necessarily identically distributed exponential
r.v.'s) with $p_a(x) = \prod_{i=1}^{n}(x+\beta_i)$ and correspondingly $c_0 = \prod_{i=1}^{n}\beta_i$.
For w ≡ 1 one has $\hat\omega[-s] = (1-\hat B[-s])/s$, and formula (3.16) together with


5 Note that here an underlying assumption was the homogeneity condition (3.5). The
discontinuity does not necessarily disappear if the boundary conditions for a(t) are
inhomogeneous, see Ren [734].

$c_0\,\hat B[-\rho_j] = p_a(\delta-\rho_j)$ for j = 1, ..., n (note that the $\rho_j$ are solutions of the
Lundberg fundamental equation) then implies

$$\mathbb{E}\big[e^{-\delta\tau(0)};\ \tau(0)<\infty\big]
 = \sum_{j=1}^{n} \frac{c_0 - p_a(\delta-\rho_j)}{\rho_j} \prod_{k=1,\,k\ne j}^{n} \frac{1}{\rho_k-\rho_j}.$$

But by an induction argument this expression can be simplified to

$$\mathbb{E}\big[e^{-\delta\tau(0)};\ \tau(0)<\infty\big]
 = 1 - \frac{\prod_{i=1}^{n}(\delta+\beta_i) - \beta_1\cdots\beta_n}{\rho_1\cdots\rho_n}.$$
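The simplification in Example 3.10 is a purely algebraic identity in the numbers ρ_1, ..., ρ_n: it holds for any distinct nonzero values, not only for the actual roots of the Lundberg fundamental equation. The sketch below checks this numerically. It assumes the reading of the example in which the generalized Erlang stages have rates β_1, ..., β_n, so that p_a(x) is the product of the factors (x + β_i) and c_0 is the product of the β_i; all numerical values and the function names are hypothetical illustrations.

```python
from math import prod

def sum_form(delta, betas, rhos):
    """Sum over j of (c0 - p_a(delta - rho_j)) / rho_j * prod_{k != j} 1 / (rho_k - rho_j)."""
    c0 = prod(betas)
    p_a = lambda x: prod(x + b for b in betas)
    total = 0.0
    for j, rho_j in enumerate(rhos):
        weight = prod(1.0 / (rho_k - rho_j)
                      for k, rho_k in enumerate(rhos) if k != j)
        total += (c0 - p_a(delta - rho_j)) / rho_j * weight
    return total

def closed_form(delta, betas, rhos):
    """1 - (prod(delta + beta_i) - prod(beta_i)) / prod(rho_j)."""
    return 1.0 - (prod(delta + b for b in betas) - prod(betas)) / prod(rhos)

delta = 0.05
betas = [1.0, 2.0, 3.5]
rhos = [0.3, 1.1, 2.7]   # arbitrary distinct nonzero numbers, not actual roots

print(sum_form(delta, betas, rhos), closed_form(delta, betas, rhos))
```

For n = 1 the closed form reduces to 1 − δ/ρ_1. To obtain an actual Laplace transform value, the ρ_j would have to be computed as the roots with positive real part of the Lundberg fundamental equation for a given claim size distribution.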


Notes and references Early studies of penalty-related quantities in renewal mod- 
els include Dickson & Hipp [316], Cheng & Tang [237], Tsai & Sun [857], Sun & Yang 
[819] and Drekic et al. [328]. Extensions to general discounted penalty functions in 
renewal models with Erlang interclaim times go back to Li & Garrido [589] and Ger- 
ber & Shiu [411]. For this model, Li [584] and Li & Garrido [589] give an alternative 
representation of the renewal equation (1.3) in terms of certain integral transforms 
$T_r$ that can be interpreted as pseudo-resolvents of the differentiation operator (evaluated
at $r = \rho_1,\dots,\rho_n$). These transforms turn out to be helpful in related models as
well (originally studied by Redheffer [728]; they are nowadays usually referred to as 
Dickson-Hipp operators), see [411] for a detailed comparison of methods. Since then 
there have been numerous further papers on the subject, and the following list is by 
no means exhaustive. More general interclaim times are treated in Li & Garrido [590], 
Schmidli [778] and Song et al. [815]. Ren [734] extends Proposition 2.14 and Li [585] 
extends Proposition 2.13 to phase-type interclaim times. In Li [586], the latter result 
is used to study the time to recovery To and the maximum severity of ruin for phase- 
type interclaim times. Biard et al. [163] study the asymptotic behavior of the expected 
time-integrated negative part of the risk process. Li & Dickson [587] investigate the 
maximum surplus before ruin in general Sparre Andersen models. Willmot, Cai & Lin 
[888] derive general bounds for solutions of renewal equations and apply them to the
present set-up. 

The derivation of the integro-differential equation with operators is from Constan- 
tinescu [254], where the formulation is in terms of adjoint operators. Albrecher et al. 
[21] start from (3.10) to factorize the differential operator and subsequently lift this 
factorization to the equation level, which leads to an iterative solution of first-order 
boundary value problems and allows one to obtain a number of explicit expressions for
m(u) using Gröbner bases. Landriault & Willmot [572] give explicit expressions for the
Laplace transform $\hat m[-s]$ for arbitrary interclaim times and Coxian claim sizes. However,
its explicit inversion is in general difficult. Section 3a is based on Schmidli [780],
who uses the same technique to work out Lundberg-type approximations also in more 
general models including certain Cox models. The trick to use forward markovization
(instead of backward markovization) needs some care w.r.t. the appropriate filtration,
but can be quite powerful. For a detailed discussion see e.g. Rolski et al. [746]. 

The idea to get rid of the discounting by modifying the interclaim distribution can 
be found in Avram & Usabel [113] and Ren [735]. 

Using duality relations to a compound Poisson model with arbitrary claim size 
distribution, a closed-form formula for the density of the time to ruin for arbitrary 
interclaim times and exponential claim sizes (in terms of an infinite series of convolu- 
tions of A) is derived in Borovkov & Dickson [189], see also Dickson & Li [317]. 

Necessary amendments of the above results for stationary renewal models are for 
instance discussed in Willmot et al. [889, 890] and Ng [664], for other delayed renewal 
models see Willmot [887]. For bounds on the distribution of the deficit, see Chadji- 
constantinidis & Politis [228] and Psarrakos [719]. The asymptotic behavior of m(u) 
for large u in the presence of heavy and semi-heavy tails depends in a subtle way on 
the shape of the penalty function w, see Tang & Wei [834] for a fine and complete 
analysis. For an extension to a model with constant interest rate, see Wu, Lu & Fang 
[895]. 


4 Lévy risk models 


As already discussed in Chapter XI, there may be certain reasons to consider 
more general Lévy processes in the risk reserve modeling procedure, let alone 
the appeal of generality on the mathematical level. Recall from the quintuple 
identity of Section XI.4e that the joint distribution of the surplus prior to ruin 
and the deficit at ruin of a general Lévy process can be expressed through 
potential measures. The resulting expression, however, usually does not lead to 
explicit expressions unless one adds further restrictions on the model. We will 
consider in the sequel two cases that admit a rather explicit treatment, namely 
the case with one-sided jumps and the compound Poisson process with two-sided 
jumps. 


4a Spectrally negative Lévy processes 


If the risk reserve process is a Lévy process that can only have downward jumps, 
then it is possible to find an integral representation of the Gerber-Shiu function 
through the corresponding scale function. As in Section XI.3, $\rho_\delta$ denotes the
positive solution of the Lundberg equation κ(s) = δ.


Theorem 4.1 Suppose that $\{R_t\}$ is a spectrally negative Lévy process with
positive drift. Then for a bounded measurable penalty function w(x, y) with
$w(\cdot,0) \equiv 0$,

$$m(u) = \int_0^\infty\!\int_0^\infty w(x,y)\,\big(e^{-\rho_\delta x}\,W^{(\delta)}(u) - W^{(\delta)}(u-x)\big)\,\nu(x+dy)\,dx.$$

Remark 4.2 Note that ruin can occur either by a jump or through diffusion,
and the assumption $w(\cdot,0) \equiv 0$ simply restricts the discounted penalty function
to the case when ruin happens through jumps. If ruin is caused by diffusion,
then the problem is somewhat degenerate with $R_{\tau(u)-} = R_{\tau(u)} = 0$. In that
case one knows from Pistorius [703] that

$$\mathbb{E}\big[e^{-\delta\tau(u)};\ R_{\tau(u)} = 0\big] = \frac{\sigma^2}{2}\big(W^{(\delta)\prime}(u) - \rho_\delta\,W^{(\delta)}(u)\big).$$

If $\{R_t\}$ has bounded variation, the assumption $w(\cdot,0) \equiv 0$ is not needed. Also,
if $\{R_t\}$ has unbounded variation and σ = 0, the assumption is not needed for
u > 0.


Proof of Theorem 4.1. Our model for $R_t$ is equivalent to a spectrally positive
Lévy process with negative drift and $R_0 = 0$, where the ruin event then refers
to an overshoot of level u. Hence we can directly use Corollary XI.4.7 to write
down the (defective) joint density of the surplus prior to ruin and the deficit at
ruin. In particular, integrating XI.(4.8) w.r.t. y and translating into the present
notation we obtain

$$\mathbb{P}\big(R_{\tau(u)-} \in dx,\ |R_{\tau(u)}| \in dy\big) = \big(W^{(0)}(u) - W^{(0)}(u-x)\big)\,\nu(x+dy)\,dx.$$

We can now use the fact that exponential tilting by $\rho_\delta$ leaves the drift of the
process positive and by XI.(3.8) the zero-scale function under the tilted measure
is given by $W^{(0)}_{\rho_\delta}(u) = e^{-\rho_\delta u}\,W^{(\delta)}(u)$ and the Lévy measure changes to
$\nu_{\rho_\delta}(dx) = e^{-\rho_\delta x}\,\nu(dx)$. Hence we can write

$$\begin{aligned}
\mathbb{E}\big[e^{-\delta\tau(u)};\ R_{\tau(u)-} \in dx,\ |R_{\tau(u)}| \in dy\big]
 &= e^{\rho_\delta(u+y)}\,\mathbb{P}_{\rho_\delta}\big(R_{\tau(u)-} \in dx,\ |R_{\tau(u)}| \in dy\big) \\
 &= e^{\rho_\delta(u+y)}\,\big(W^{(0)}_{\rho_\delta}(u) - W^{(0)}_{\rho_\delta}(u-x)\big)\,\nu_{\rho_\delta}(x+dy)\,dx \\
 &= \big(e^{-\rho_\delta x}\,W^{(\delta)}(u) - W^{(\delta)}(u-x)\big)\,\nu(x+dy)\,dx.
\end{aligned}$$

From the latter the assertion follows.


Remark 4.3 In the absence of a diffusion component (i.e. σ = 0), the jumps
larger than a fixed ε > 0 form a compound Poisson process. As ε → 0, this
compound Poisson process converges weakly to the original spectrally negative
Lévy process. One can now use this fact to observe that a number of results
derived for the compound Poisson process still hold for more general pure-jump
Lévy processes. The recipe is to just replace $\beta(1-B(x))$ by $\overline\nu(x)$. For instance,
from (2.13) and (2.15) it follows that for a spectrally negative Lévy process with
triplet (1, 0, ν) and some restrictions on the penalty function, m(u) satisfies the
defective renewal equation m = m * g + h with

$$g(u) = \int_u^\infty e^{-\rho_\delta(x-u)}\,\nu(dx)$$

and

$$h(u) = \int_u^\infty\!\int_0^\infty e^{-\rho_\delta(x-u)}\,w(x,y)\,\nu(x+dy)\,dx.$$


The compound Poisson risk model with perturbation 


Consider the risk reserve process

$$R_t = u + t - \sum_{i=1}^{N_t} U_i + \sigma W_t, \qquad t \ge 0, \tag{4.1}$$

where $\{N_t\}$ is again a homogeneous Poisson process and $\{W_t\}$ is an independent
standard Brownian motion. The interpretation is that the diffusion part accounts
for small perturbations of the risk process that can come from various
sources (inaccuracies in the estimation or measurement, local deviations in the 
premium income or claim payouts etc.). As usual, the justification of using a 
diffusion to model such effects is that it can be thought of as the sum of many 
small independent effects and in the absence of further knowledge it is natural 
to assume that its drift is zero. Mathematically, (4.1) is clearly a special case of 
a spectrally negative Lévy process and so the above fluctuation theory and its 
results apply, but due to its simplicity this model can also be treated by other 
self-contained techniques which can give additional insight. 

In the following we give an illustration of this. Recall that in the presence of 
the Brownian component, ruin can occur in two ways, either by a claim (which 
results in a non-zero deficit at ruin) or by oscillation (which is also often referred 
to as creeping). Assume that the penalty for the second is given by the constant 


8In fact, it converges even almost surely uniformly on bounded time intervals, see Bertoin
[157].

w(0) = $w_0$. With the generator technique of Chapter II it is clear that the
discounted penalty function now satisfies the integro-differential equation

$$\beta\int_0^u m(u-x)\,B(dx) + \beta\int_u^\infty w(u,x-u)\,B(dx) - (\beta+\delta)\,m(u) + m'(u) + \frac{\sigma^2}{2}\,m''(u) = 0. \tag{4.2}$$
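One can now again proceed with Laplace transform techniques as in Section 2a.
Then

$$\hat m[-s] = \frac{m(0) + \frac{\sigma^2}{2}\big(s\,m(0) + m'(0)\big) - \beta\,\hat\omega[-s]}{\kappa(-s) - \delta},$$

where now $\kappa(r) = \beta\hat B[r] - r - \beta + \sigma^2 r^2/2$. Starting with zero initial capital leads
to ruin immediately (due to the oscillation), so that m(0) = $w_0$. Since κ(−s) − δ
has exactly one positive zero $\rho_\delta > 0$, this must be a zero of the numerator
as well (again $\hat m[-s]$ is analytic for ℜ(s) > 0). Hence we arrive at

$$\hat m[-s] = \frac{\frac{\sigma^2}{2}\,w_0\,(s-\rho_\delta) + \beta\big(\hat\omega[-\rho_\delta] - \hat\omega[-s]\big)}{\kappa(-s) - \delta}, \tag{4.3}$$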


which extends formula (2.4). 

By adapting the techniques of Section 2b one can show that for general 
penalty functions w(x, y) (under mild assumptions, see Sarkar & Sen [762] for 
details) the Cramér-Lundberg approximation 


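$$\lim_{u\to\infty} m(u)\,e^{\gamma_\delta u} = C_\delta$$

holds for the perturbed model (4.1). The constant is again given by
$C_\delta = \lim_{s\to 0} s\,\hat m[-s+\gamma_\delta]$, i.e. with L'Hôpital's rule we obtain from (4.3)

$$C_\delta = \frac{1}{\kappa'(\gamma_\delta)}\Big(\beta\int_0^\infty\!\int_0^\infty w(x,y)\,\big(e^{\gamma_\delta x} - e^{-\rho_\delta x}\big)\,b(x+y)\,dx\,dy + \frac{\sigma^2}{2}\,w_0\,(\gamma_\delta+\rho_\delta)\Big). \tag{4.4}$$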


For w(x, y) = $w_0$ = 1, this further simplifies to

$$C_\delta = \frac{\delta}{\kappa'(\gamma_\delta)}\Big(\frac{1}{\gamma_\delta} + \frac{1}{\rho_\delta}\Big),$$

which formally coincides with (2.20), but note that the underlying κ(r) is different.
If furthermore δ = 0 (i.e. m(u) = ψ(u)), then $C_0 = C = (1-\beta\mu_B)/\kappa'(\gamma)$,
in accordance with Corollary XI.2.7.
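The simplification for w = w_0 = 1 can be verified numerically. The sketch below assumes, purely for illustration, exponential claims with rate ν and hypothetical parameter values; the bisection helper and all variable names are not from the text. It computes γ_δ and ρ_δ from κ(r) = β(B̂[r] − 1) − r + σ²r²/2 and compares the general constant, (β(ω̂[γ_δ] − ω̂[−ρ_δ]) + (σ²/2) w_0 (γ_δ + ρ_δ)) / κ'(γ_δ) with ω̂[r] = (B̂[r] − 1)/r, against the simplified form (δ/κ'(γ_δ)) (1/γ_δ + 1/ρ_δ).

```python
def bisect(f, lo, hi, iters=200):
    """Plain bisection; assumes f(lo) < 0 < f(hi)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical parameters: Poisson rate beta, exponential(nu) claims,
# volatility sigma, discount rate delta; premium rate 1 as in model (4.1).
beta, nu, sigma, delta = 1.0, 2.0, 0.5, 0.05

def B_hat(r):            # moment generating function of the claims, r < nu
    return nu / (nu - r)

def kappa(r):
    return beta * (B_hat(r) - 1.0) - r + 0.5 * sigma**2 * r**2

def kappa_prime(r):
    return beta * nu / (nu - r)**2 - 1.0 + sigma**2 * r

def omega_hat(r):        # transform of omega(x) = 1 - B(x), i.e. penalty w = 1
    return (B_hat(r) - 1.0) / r

gamma = bisect(lambda r: kappa(r) - delta, 1e-12, nu - 1e-9)   # kappa(gamma) = delta
rho = bisect(lambda s: kappa(-s) - delta, 1e-12, 100.0)        # kappa(-rho) = delta

w0 = 1.0
C_general = (beta * (omega_hat(gamma) - omega_hat(-rho))
             + 0.5 * sigma**2 * w0 * (gamma + rho)) / kappa_prime(gamma)
C_simple = delta / kappa_prime(gamma) * (1.0 / gamma + 1.0 / rho)
print(gamma, rho, C_general, C_simple)
```

Setting `sigma = 0.0` in the same script gives the unperturbed compound Poisson case, where the two expressions also agree.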

With δ = 0 in (4.4) and the previously established fact that $\delta/\rho_\delta \to 1-\beta\mu_B$ as δ → 0,
the choice w(x, y) = 0 and $w_0$ = 1 finally gives the asymptotic probability of
ruin caused by oscillation to be

$$\psi_d(u) \sim \frac{\sigma^2\gamma}{2\,\kappa'(\gamma)}\,e^{-\gamma u} \quad \text{as } u \to \infty,$$

and the probability of ruin caused by a claim (w(x, y) = 1 and $w_0$ = 0) behaves
as

$$\frac{1-\beta\mu_B-\sigma^2\gamma/2}{\kappa'(\gamma)}\,e^{-\gamma u}$$

as u → ∞.


Notes and references In Biffis & Kyprianou [165], Theorem 4.1 is given in a 
more general form, where the Gerber-Shiu function also includes the size of the last 
minimum before ruin; see also Biffis & Morales [166] for a convolution-type approach. 
Chiu & Yin [245] give expressions for the duration of ruin and the time of the last 
visit of the ruin boundary for spectrally negative Lévy processes. The idea of Remark 
4.3 goes back to Dufresne, Gerber & Shiu [335]; see also Garrido & Morales [391]. 
Klüppelberg, Kyprianou & Maller [543] derive explicit asymptotic results for ruin-related
quantities like the deficit at ruin and the surplus prior to ruin for u → ∞; see
also Doney & Kyprianou [326]. 

For a detailed study of the Gerber-Shiu function in a compound Poisson model 
with Brownian perturbation including defective renewal equations and asymptotics, 
we refer to Gerber & Landry [406] and Tsai & Willmot [858]. Lin & Wang [595] 
apply these results to the pricing of perpetual American catastrophe put options. 
The same argument as in Remark 4.3 applies to extend the resulting formulas to 
general spectrally negative Lévy processes (see Morales [649] for details). Some explicit 
calculations for this model under phase-type claims can be found in Ren [733]. For 
inclusion of interest rates see Wang & Wu [872] and a recent extension with more 
general investment is given in Avram & Usabel [114] and Wang, Xu & Yao [867]. 
Gerber-Shiu functions under Brownian perturbation in renewal models are for instance 
studied by Li & Garrido [591] and in Markov-modulated compound Poisson models 
by Lu & Tsai [611]. There have also been studies with more general perturbations 
than Brownian motion. Among them, Furrer [381] deals with α-stable motion for
the perturbation and Chi, Jaimungal & Lin [244] use singular perturbation theory to 
deal with the Gerber-Shiu function under perturbation with stochastic volatility of 
Ornstein-Uhlenbeck type. 


4b The compound Poisson model with two-sided jumps 


Another case that admits a direct treatment is a compound Poisson model where
jumps can be both upward and downward. Complementing the first passage
expressions given in Section XI.5, let us briefly revisit risk models of the type

$$R_t = u + \sum_{i=1}^{N_t^u} P_i - \sum_{i=1}^{N_t} U_i. \tag{4.5}$$

Here the linear drift t from the Cramér-Lundberg process is replaced by a compound
Poisson process with i.i.d. positive up-jumps $P_i$ (with density p(x)) that
occur according to a homogeneous Poisson process $\{N_t^u\}$ with intensity $\beta^u$,
independent of the claims process. For transparency of the exposition, we include
neither an additional drift term nor a further Brownian perturbation component,
although each is easily possible, see the Notes.

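Let h > 0. By conditioning on the time and amount of the first jump before
time h, one has⁷

$$\begin{aligned}
m(u) ={}& \beta\int_0^h e^{-(\beta+\beta^u+\delta)t}\int_0^u m(u-x)\,b(x)\,dx\,dt \\
 &+ \beta\int_0^h e^{-(\beta+\beta^u+\delta)t}\int_u^\infty w(u,x-u)\,b(x)\,dx\,dt \\
 &+ \beta^u\int_0^h e^{-(\beta+\beta^u+\delta)t}\int_0^\infty m(u+x)\,p(x)\,dx\,dt + e^{-(\beta+\beta^u+\delta)h}\,m(u),
\end{aligned}$$

which after differentiation with respect to h and setting h = 0 gives

$$\beta\int_0^u m(u-x)\,b(x)\,dx + \beta\int_u^\infty w(u,x-u)\,b(x)\,dx
 + \beta^u\int_0^\infty m(u+x)\,p(x)\,dx - (\beta+\beta^u+\delta)\,m(u) = 0. \tag{4.6}$$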


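The function m(u) is the unique solution of (4.6), since the mapping

$$m(u) \mapsto \frac{\beta}{\beta+\beta^u+\delta}\Big(\int_0^u m(u-x)\,b(x)\,dx + \int_u^\infty w(u,x-u)\,b(x)\,dx\Big)
 + \frac{\beta^u}{\beta+\beta^u+\delta}\int_0^\infty m(u+x)\,p(x)\,dx$$

is a contraction and has a unique fixed point. Let us again impose the boundary
condition $\lim_{u\to\infty} m(u) = 0$.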

Instead of using Laplace transforms, we shall here proceed in a related, but 
slightly heuristic way and restrict the analysis to a penalty function that only 


7 As before, the formal background for this type of reasoning is the generator approach of
Section II.4a.

depends on the deficit, i.e. w(x, y) = w(y). Assume first that the claim size
distribution is a combination of n exponentials, i.e.

$$b(x) = \sum_{i=1}^{n} A_i\,\alpha_i\,e^{-\alpha_i x}, \qquad x > 0, \tag{4.7}$$

where $0 < \alpha_1 < \alpha_2 < \dots < \alpha_n$ and $A_1 + \dots + A_n = 1$. Some of the $A_i$'s may
be negative as long as b(x) ≥ 0. Then the discounted penalty function is of the
form

$$m(u) = \sum_{k=1}^{n} C_k\,e^{-r_k u}, \qquad u \ge 0. \tag{4.8}$$
To see this, one substitutes (4.8) into (4.6); $r_1,\dots,r_n$ turn out to be the n
solutions with positive real part of the generalized Lundberg equation

$$\beta\sum_{i=1}^{n} A_i\,\frac{\alpha_i}{\alpha_i-r} + \beta^u\int_0^\infty e^{-rx}\,p(x)\,dx - (\beta+\beta^u+\delta) = 0 \tag{4.9}$$

(potential negative solutions would violate $\lim_{u\to\infty} m(u) = 0$). This equation
has indeed exactly n solutions with positive real part (by the usual Rouché
argument). The one with the smallest real part is real and is the adjustment
coefficient $\gamma_\delta < \alpha_1$. The coefficients $C_1,\dots,C_n$ are the solutions of

$$\sum_{k=1}^{n} \frac{C_k}{\alpha_i-r_k} = \frac{\Pi_i}{\alpha_i}, \qquad i = 1,\dots,n, \tag{4.10}$$

with the notation

$$\Pi_i = \alpha_i\int_0^\infty w(y)\,e^{-\alpha_i y}\,dy.$$


One way to solve this system of n linear equations for $C_1,\dots,C_n$ goes as follows.
Define a rational function

$$Q(r) = \sum_{k=1}^{n} \frac{C_k}{r-r_k}$$

(note that $\hat m[-s] = -Q(-s)$). Obviously,

$$C_h = \lim_{r\to r_h} (r-r_h)\,Q(r), \qquad h = 1,\dots,n. \tag{4.11}$$


One can now find more tractable expressions for Q(r) and apply (4.11) to these
expressions. Note that Q(r) is completely determined by the following three
properties:

• It is a rational function of the type polynomial of degree at most n − 1
divided by polynomial of degree n.

• Its poles are $r_1,\dots,r_n$.

• $Q(\alpha_i) = \Pi_i/\alpha_i$, i = 1, ..., n, according to (4.10).


The rational function

$$Q_1(r) = \frac{\sum_{j=1}^{n} \frac{\Pi_j}{\alpha_j} \prod_{k=1}^{n} (\alpha_j-r_k) \prod_{i=1,\,i\ne j}^{n} \frac{r-\alpha_i}{\alpha_j-\alpha_i}}{\prod_{k=1}^{n} (r-r_k)}$$

also fulfills these properties, and this together with (4.11) gives a full
specification of (4.8).
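The residue recipe can be sketched in a few lines. The code below assumes a rational function of the stated type, namely a Lagrange-type numerator taking the value (Π_j/α_j) ∏_k (α_j − r_k) at r = α_j, divided by ∏_k (r − r_k); it computes C_h as the residue at r_h and then checks that the linear system Σ_k C_k/(α_i − r_k) = Π_i/α_i is satisfied. The values of α_i, r_k and Π_i are hypothetical (any distinct α_i, distinct r_k with α_i ≠ r_k will do), and `coefficients_from_Q1` is an illustrative name.

```python
from math import prod

def coefficients_from_Q1(alphas, roots, Pis):
    """C_h = lim_{r -> r_h} (r - r_h) Q_1(r) = numerator(r_h) / prod_{k != h} (r_h - r_k)."""
    n = len(alphas)

    def numerator(r):
        total = 0.0
        for j in range(n):
            scale = Pis[j] / alphas[j] * prod(alphas[j] - rk for rk in roots)
            basis = prod((r - alphas[i]) / (alphas[j] - alphas[i])
                         for i in range(n) if i != j)
            total += scale * basis
        return total

    return [numerator(rh) / prod(rh - rk for k, rk in enumerate(roots) if k != h)
            for h, rh in enumerate(roots)]

# Hypothetical inputs, for illustration only.
alphas = [1.0, 2.5, 4.0]
roots = [0.6, 1.8, 3.1]
Pis = [0.9, 0.7, 0.4]

C = coefficients_from_Q1(alphas, roots, Pis)

# Residuals of the linear system sum_k C_k / (alpha_i - r_k) = Pi_i / alpha_i.
residuals = [sum(Ck / (a - rk) for Ck, rk in zip(C, roots)) - Pi / a
             for a, Pi in zip(alphas, Pis)]
print(C, residuals)
```

In an actual application the r_k would be the roots with positive real part of the generalized Lundberg equation and the Π_i would be computed from the penalty function; the closed-form residues then avoid solving the n × n system directly.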

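If we now restrict to $p(x) = \eta e^{-\eta x}$ (i.e. exponential up-jumps), then the
Lundberg equation (4.9) specializes to

$$\beta\sum_{i=1}^{n} A_i\,\frac{\alpha_i}{\alpha_i-r} + \beta^u\,\frac{\eta}{\eta+r} - (\beta+\beta^u+\delta) = 0.$$

In addition to $r_1,\dots,r_n$, this equation has one negative solution $-\rho_\delta$. We can
hence represent Q(r) also as

$$Q_2(r) = \frac{\frac{\beta}{\eta+r}\Big((\eta+r)\sum_{i=1}^{n} \frac{A_i\Pi_i}{\alpha_i-r} - (\eta-\rho_\delta)\sum_{i=1}^{n} \frac{A_i\Pi_i}{\alpha_i+\rho_\delta}\Big)}
 {\beta\sum_{i=1}^{n} A_i\,\frac{\alpha_i}{\alpha_i-r} + \beta^u\,\frac{\eta}{\eta+r} - (\beta+\beta^u+\delta)},$$

which immediately leads to

$$C_h = \frac{\frac{\beta}{\eta+r_h}\Big((\eta+r_h)\sum_{i=1}^{n} \frac{A_i\Pi_i}{\alpha_i-r_h} - (\eta-\rho_\delta)\sum_{i=1}^{n} \frac{A_i\Pi_i}{\alpha_i+\rho_\delta}\Big)}
 {\beta\sum_{i=1}^{n} A_i\,\frac{\alpha_i}{(\alpha_i-r_h)^2} - \beta^u\,\frac{\eta}{(\eta+r_h)^2}}, \qquad h = 1,\dots,n. \tag{4.12}$$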


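Furthermore, with $m(0) = \lim_{r\to\infty} r\,Q_2(r)$ we get

$$m(0) = \frac{\beta}{\beta+\beta^u+\delta}\sum_{i=1}^{n} A_i\Pi_i\Big(1 + \frac{\eta-\rho_\delta}{\alpha_i+\rho_\delta}\Big)
 = \frac{\beta}{\beta+\beta^u+\delta}\sum_{i=1}^{n} A_i\Pi_i\,\frac{\alpha_i+\eta}{\alpha_i+\rho_\delta}.$$

The class of distributions (4.7) is dense in the class of all positive distributions,
so (heuristic, but intuitive!) one can deduce from the above that m(0) is given
by

$$m(0) = \frac{\beta}{\beta+\beta^u+\delta}\Big[\int_0^\infty w(y)\,b(y)\,dy + (\eta-\rho_\delta)\int_0^\infty w(y)\int_0^\infty e^{-\rho_\delta x}\,b(x+y)\,dx\,dy\Big]. \tag{4.13}$$

In this general case, $-\rho_\delta$ is the negative solution of the equation

$$\kappa(r) = \beta\hat B[r] + \beta^u\,\frac{\eta}{\eta+r} - (\beta+\beta^u) = \delta,$$

which is defined for all r > −η with $\hat B[r] < \infty$ (see Figure XII.2, which shows
that with up-jumps the straight line from Figure XII.1 for the classical model
is replaced by a more general curve).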


[FIGURE XII.2: graphical solution of the generalized Lundberg equation (plot not
reproduced); recoverable labels: βB̂[r], β + β^u + δ, β + δ, and −ρ_δ, γ_δ on the r-axis.]


Note that $\beta/(\beta+\beta^u+\delta)$ is the discounted probability that the surplus process
has a downward jump before the first upward jump (ruin occurs at that time),
which explains the first summand in the above formula for m(0). Let again $g_\delta(y)$ denote
the discounted probability density function of the deficit at ruin for initial surplus
zero (i.e. the discounted descending ladder height density). Because the formula for m(0)
holds for arbitrary w(y), by choosing the Dirac-delta function we get

$$g_\delta(y) = \frac{\beta}{\beta+\beta^u+\delta}\Big[b(y) + (\eta-\rho_\delta)\int_0^\infty e^{-\rho_\delta x}\,b(x+y)\,dx\Big], \qquad y > 0. \tag{4.14}$$


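If the Lundberg approximation

$$m(u) \sim C_\delta\,e^{-\gamma_\delta u} \quad \text{as } u \to \infty$$

holds,⁸ then the corresponding extension of (4.12) with $r_h = \gamma_\delta$ immediately
gives

$$C_\delta = \frac{\beta}{\eta+\gamma_\delta}\cdot\frac{\int_0^\infty w(y)\int_0^\infty \big[(\eta+\gamma_\delta)\,e^{\gamma_\delta x} - (\eta-\rho_\delta)\,e^{-\rho_\delta x}\big]\,b(x+y)\,dx\,dy}{\kappa'(\gamma_\delta)}. \tag{4.15}$$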


Corollary 4.4 The Laplace transform of the time to ruin with zero initial capital
in the two-sided compound Poisson model with exponential(η) up-jumps is
given by

$$\mathbb{E}\big[e^{-\delta\tau(0)};\ \tau(0)<\infty\big] = 1 - \frac{\eta\,\delta}{\rho_\delta\,(\beta+\beta^u+\delta)}.$$

With positive safety loading, the ruin probability with zero initial capital is

$$\psi(0) = \frac{\beta}{\beta+\beta^u}\,(1+\eta\mu_B). \tag{4.16}$$

Furthermore, the constant C in the Cramér-Lundberg approximation
$\psi(u) \sim C e^{-\gamma u}$ is given by

$$C = \frac{\beta^u - \eta\,\beta\mu_B}{(\eta+\gamma)\,\kappa'(\gamma)}. \tag{4.17}$$

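Proof. Just choose w ≡ 1 in the general formula for m(0) above and use the identity

$$\int_0^\infty\!\int_0^\infty e^{rx}\,b(x+y)\,dy\,dx = \frac{\hat B[r]-1}{r} \tag{4.18}$$

for $r = -\rho_\delta$ together with $\kappa(-\rho_\delta) = \delta$.
For the ruin probability, $\rho_\delta \to 0$ as δ → 0, so ψ(0) follows from the limit
$\delta/\rho_\delta \to \beta^u/\eta - \beta\mu_B$ as δ → 0. Alternatively, take w ≡ 1 and δ → 0 directly in
the formula for m(0).
Finally, the constant C follows from (4.15) by another application of (4.18),
this time for $r = \gamma_0$ with $\kappa(\gamma_0) = 0$.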
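The steps of this proof can be traced numerically. The sketch below assumes, purely for illustration, exponential claims with rate ν (so that B̂[r] = ν/(ν − r) and μ_B = 1/ν) and hypothetical intensities; the helper names are not from the text. It solves κ(−ρ) = δ on (0, η) by bisection, evaluates m(0) for w ≡ 1 via the general two-summand formula, compares it with the closed form 1 − ηδ/(ρ_δ(β + β^u + δ)), and checks that a small δ reproduces ψ(0) = β(1 + ημ_B)/(β + β^u).

```python
def bisect(f, lo, hi, iters=200):
    """Plain bisection; assumes f(lo) < 0 < f(hi)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical parameters: claims arrive at rate beta with exponential(nu) sizes,
# up-jumps arrive at rate beta_up with exponential(eta) sizes.
# Positive safety loading: beta_up / eta = 1.5 > beta / nu = 0.5.
beta, beta_up, eta, nu = 1.0, 3.0, 2.0, 2.0
mu_B = 1.0 / nu

def B_hat(r):
    return nu / (nu - r)

def kappa(r):
    return beta * B_hat(r) + beta_up * eta / (eta + r) - (beta + beta_up)

def rho_of(delta):
    """Positive number rho with kappa(-rho) = delta, searched in (0, eta)."""
    return bisect(lambda s: kappa(-s) - delta, 0.0, eta - 1e-9)

def m0_general(delta):
    """m(0) for w = 1, using identity (4.18) for the double integral."""
    rho = rho_of(delta)
    double_integral = (1.0 - B_hat(-rho)) / rho
    return beta / (beta + beta_up + delta) * (1.0 + (eta - rho) * double_integral)

def m0_closed(delta):
    rho = rho_of(delta)
    return 1.0 - eta * delta / (rho * (beta + beta_up + delta))

psi0 = beta / (beta + beta_up) * (1.0 + eta * mu_B)
print(m0_general(0.05), m0_closed(0.05), m0_closed(1e-6), psi0)
```

With these parameters ψ(0) equals 0.5, which also matches the decomposition into "down-jump first" (probability 0.25) plus "up-jump first" (probability 0.75) times the claims-to-income ratio 1/3.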


Remark 4.5 Formula (4.16) can be reformulated as

$$\psi(0) = \frac{\beta}{\beta+\beta^u} + \frac{\beta^u}{\beta+\beta^u}\cdot\frac{\beta\,\mathbb{E}[U]}{\beta^u\,\mathbb{E}[P]},$$


8 Conditions under which a defective renewal equation for m can be derived (and hence
a Lundberg approximation for light-tailed B exists by the key renewal theorem) can e.g. be
found in Labbé & Sendova [570].

which has the following interpretation: The first term is the probability that a
down-jump occurs before an up-jump, in which case ruin occurs at that time. If not
(the probability of which is $\beta^u/(\beta+\beta^u)$), the conditional probability of ruin is
the ratio of expected claim payments per time unit over expected income per
time unit, which is a natural extension of Corollary IV.3.1 from the one-sided
jumps model.


Remark 4.6 Choosing δ = 0 in (4.14) gives the (non-discounted defective descending)
ladder height density

$$g_0(y) = \frac{\beta}{\beta+\beta^u}\big(b(y) + \eta\,\overline B(y)\big), \qquad y > 0,$$

which is the extension of $g_0(y) = \beta\,\overline B(y)$ for the compound Poisson process with
one-sided jumps derived in Theorem III.5.1.


If we take the limit $\beta^u \to \infty$, η → ∞ such that $\beta^u/\eta = 1$, then $\sum_{i=1}^{N_t^u} P_i \to t$,
so the up-jumps converge to a linear drift with slope 1 and we arrive at the
classical Cramér-Lundberg model. Correspondingly, one can retrieve from each
of the above results the corresponding Cramér-Lundberg analogues in Section 2
as a limit case.


Remark 4.7 If the up-jump distribution is extended to a combination of exponentials,
in principle analogous formulas hold, but the Lundberg equation
κ(r) = δ then has additional zeros in the negative halfplane, so that the exposition
gets more cumbersome and is omitted here (see the Notes below).


Notes and references In an obvious way, one can derive the analogous expressions
for an added Brownian perturbation in (4.5), where the polynomials in Q(r) will then
have their degree increased by 1. Also, the inclusion of a drift term is just a notational
issue (and was left out here deliberately for transparency, and also in order to identify
the Cramér-Lundberg model as a simple limit). A Pollaczeck-Khinchine-type formula
for the ruin probability in the two-sided pure-jump model was derived by Boucherie,
Boxma & Sigman [191] in a queueing context, exploiting the observation that up-jumps
can be equivalently described as an increase of inter-arrival times in a renewal
model with constant premium intensity as long as the desired quantities are invariant
to scaling of time; for early time-dependent considerations we refer to Perry, Stadje &
Zacks [692], Kou & Wang [559] and Jacobsen [498]. Since then many papers have appeared
on the subject. The slightly heuristic procedure given above is from Albrecher,
Gerber & Yang [23] and can be formally backed up by the usual IDE and renewal
techniques.




In more finance-oriented contexts, a compound Poisson model with two-sided 
jumps and perturbation is usually referred to as a jump-diffusion (see e.g. Kou & 
Wang [560]). This model is for instance investigated by IDE methods in Chen, Lee 
& Sheu [233] and by renewal techniques in Zhang, Yang & Li [915] under weaker as- 
sumptions on the up-jumps. For methodological links between ruin theory and credit 
ratings assessments under this model assumption, see e.g. Chen & Panjer [232]. 


Related considerations of the Wiener-Hopf factorization for more general up-jumps 
with rational Laplace transform are given in Section XI.5, see also Lewis & Mordecki 
[582], Pistorius [704], Dieker [323], Levendorskii [581] with finance applications and Chi 
[243] for formulations in terms of Gerber-Shiu functions. Roynette, Vallois & Volpi 
[750] identify the limit distribution as u → ∞ of the surplus prior to ruin and the
deficit at ruin in such a model. For a Sparre Andersen model with two-sided jumps, 
see Zhang & Yang [914]. 

For most of the above results, the zeros of the generalized Lundberg function play 
a crucial role. Accordingly, for possible extensions of the results to more general Lévy 
models with two-sided jumps, a fine study of these zeros is essential, see e.g. Kuznetsov 
[563] for more general scenarios that are still somewhat tractable. 

The Gerber-Shiu function has also been extensively studied for risk processes that 
are reflected at a horizontal barrier b (in which case we denote it by m(u;b)), with
one possible interpretation that above the barrier, all premium income is paid out 
to shareholders as dividends. A closely related question is then to determine the 
expected present value V(u;b) of the corresponding aggregate dividend payouts until 
ruin, a quantity that in certain economic approaches is interpreted as the ‘value’ of the 
insurance portfolio (see the Notes of Section VIII.1). If the discount factor is the same 
ô as the one for m(u), it is clear that the dynamics of the process for m(u; b) and V (u; b) 
between 0 and b are identical, differences occur only upon exit of this interval. Under 
particular model assumptions, this translates into integro-differential equations only 
differing in inhomogeneity terms and boundary conditions. For the compound Poisson 
model Lin, Willmot & Drekic [597] identified in this way the so-called dividends-penalty 
identity 

$$m(u;b) = m(u) - m'(b)\,V(u;b), \qquad 0 \le u \le b.$$


See also Yuen, Wang & Li [912] and Cai, Feng & Willmot [216] for inclusion of inter- 
est rates. Gerber, Lin & Yang [407] established this identity for arbitrary stationary 
Markov risk processes with only downward jumps by a strikingly simple probabilis- 
tic argument. An even more direct argument can be used for the model (4.5) with 
exponential up-jumps, since then the dividends are paid out discretely and the com- 
pound Poisson process identity is then again obtained as a limit, see Gerber & Yang 
[415]. Extensions for Markov-modulated processes (in which case one obtains a matrix 
identity) are investigated in Li & Lu [592]; see also Cheung & Landriault [240] for 
a Markovian arrival process set-up. As discussed in the Notes to Section VIII.1, the 
literature on dividend processes is huge and is not treated in this book. A general 
formalism under which the Gerber-Shiu function, the expected discounted dividends, 
but also more general utilities of paths of the risk process can be accommodated is
proposed in Cai, Feng & Willmot [217].
Further results on discounted penalty functions for dependent risk models will be 
treated in Chapter XVI. 


Chapter XIII 


Further models with 
dependence 


Many classical results in ruin theory rest on the independence assumption of 
claim sizes, of interclaim times and the independence between claims and inter- 
claim times. However, examples of risk processes with a certain degree of depen- 
dence appeared already at several places in this book (in particular, the Markov- 
modulated and the general Markov additive processes discussed in Chapters VII 
and IX). In this chapter, a number of further risk models with dependence will 
be discussed, some of which allow a quite analytic treatment. 

Naturally, for more involved model assumptions, the possible calculations 
will be less explicit and there is a trade-off between considering a flexible de- 
pendence model (that can be calibrated to practical portfolio situations) and 
its tractability. In any case, it is crucial to understand how (possibly neglected) 
dependence may influence the actual values of ruin probabilities and related 
measures of riskiness in the portfolio. 

We will start with some general considerations on large deviations, which are 
of independent interest, but also provide a powerful tool to generalize asymp- 
totic ruin results for light-tailed claim sizes to certain dependent scenarios. It 
will turn out that for weak forms of dependence the asymptotic behavior of 
the ruin probability is still exponential, but often with a modified adjustment 
coefficient (dependence situations where the adjustment coefficient remains un- 
changed include certain types of delay in claim settlement). For stronger (long- 
range) dependence it can happen that the ruin probability becomes heavy-tailed 
although the claim sizes are light-tailed. On the other hand, for heavy-tailed 
claim size distributions we will see in Section 2 that the asymptotic ruin probability
is relatively insensitive to weak forms of dependence (which is consistent
with the ‘one large claim’ heuristic). Sections 3-7 then deal with some more spe- 
cific dependence models and the chapter finishes with some results on ordering 
and multivariate risk processes. 


1 Large deviations 


The area of large deviations is a set of asymptotic results on rare event proba- 
bilities and a set of methods to derive such results. The last decades have seen 
a boom in the area and a considerable body of applications in queueing theory 
and insurance risk. 

The classical result in the area is Cramér's theorem. Cramér considered a
random walk $S_n = X_1 + \cdots + X_n$ such that the cumulant generating function
$\kappa(\theta) = \log \mathbb{E}e^{\theta X_1}$ is defined for sufficiently many θ, and gave sharp asymptotics
for probabilities of the form $\mathbb{P}(S_n/n \in I)$ for intervals I ⊂ ℝ. For example, if
$x > \mathbb{E}X_1$, then

$$\mathbb{P}\Big(\frac{S_n}{n} > x\Big) \sim \frac{1}{\theta\sigma\sqrt{2\pi n}}\,e^{-\eta n}, \tag{1.1}$$

where we return to the values of θ, η, σ² later.

The limit result (1.1) is an example of sharp asymptotics: ∼ means (as at
other places in the book) that the ratio is one in the limit (here n → ∞). However,
large deviations results usually have a weaker form, logarithmic asymptotics,
which in the setting of (1.1) amounts to the weaker statement

$$\lim_{n\to\infty} \frac{1}{n}\log \mathbb{P}\Big(\frac{S_n}{n} > x\Big) = -\eta. \tag{1.2}$$

Note in particular that (1.2) does not capture the $\sqrt{n}$ in (1.1) but only the
dominant exponential term; the correct sharp asymptotics might as well have
been, e.g., $c_1 e^{-\eta n}$ or $c_2 e^{-\eta n + c_3 n^{\alpha}}$ with α < 1. Thus, large deviations results
typically only give the dominant term in an asymptotic expression. Accordingly,
logarithmic asymptotics is usually much easier to derive than sharp asymptotics
but also less informative. The advantage of the large deviations approach is,
however, its generality, in being capable of treating many models beyond simple
random walks which are not easily treated by other methods, and that a
considerable body of theory has been developed.


For sequences $f_n$, $g_n$ with $f_n \to 0$, $g_n \to 0$, we will write $f_n \stackrel{\log}{\sim} g_n$ if

$$\lim_{n\to\infty} \frac{\log f_n}{\log g_n} = 1$$

(later in this section, the parameter will be u rather than n). Thus, (1.2) can
be rewritten as $\mathbb{P}(S_n/n > x) \stackrel{\log}{\sim} e^{-\eta n}$.
Example 1.1 We will go into some more detail concerning (1.1), (1.2).
Define κ* as the convex conjugate of κ,

$$\kappa^*(x) = \sup_{\theta}\,\big(\theta x - \kappa(\theta)\big)$$

(other names are the entropy, the Legendre-Fenchel transform or just the Legendre
transform, or the large deviations rate function). Most often, the sup in the
definition of κ* can be evaluated by differentiation: κ*(x) = θx − κ(θ) where
θ = θ(x) is the solution of x = κ'(θ), which is a saddlepoint equation: the
mean κ'(θ) of the distribution of $X_1$ exponentially tilted with θ, i.e. of

$$\tilde{\mathbb{P}}(X_1 \in dx) = \mathbb{E}\big[e^{\theta X_1 - \kappa(\theta)};\ X_1 \in dx\big], \tag{1.3}$$

is put equal to x. In fact, exponential change of measure is a key tool in large
deviations methods.
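Define $\eta = \kappa^*(x)$. Since

$$P\Bigl(\frac{S_n}{n} > x\Bigr) = \tilde E\bigl[e^{-\theta S_n + n\kappa(\theta)};\, S_n/n > x\bigr],$$

replacing $S_n$ by $nx$ in the exponent and ignoring the indicator yields the
Chernoff bound

$$P\Bigl(\frac{S_n}{n} > x\Bigr) \le e^{-\eta n}. \tag{1.4}$$

Next, since $S_n$ is asymptotically normal w.r.t. $\tilde P$ with mean $nx$ and variance
$n\sigma^2$ where $\sigma^2 = \sigma^2(x) = \kappa''(\theta)$, we have

$$\tilde P\bigl(nx < S_n < nx + 1.96\,\sigma\sqrt{n}\bigr) \to 0.475,$$

and hence for large $n$

$$\begin{aligned}
P(S_n/n > x) &\ge \tilde E\bigl[e^{-\theta S_n + n\kappa(\theta)};\, nx < S_n < nx + 1.96\,\sigma\sqrt{n}\bigr] \\
&\ge 0.4\, e^{-\eta n - 1.96\,\theta\sigma\sqrt{n}},
\end{aligned}$$

which in conjunction with (1.4) immediately yields (1.2).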
More precisely, if we replace $S_n$ by $nx + \sigma\sqrt{n}\,V$ where $V$ is $N(0,1)$, we get

$$\begin{aligned}
P(S_n/n > x) &\approx \tilde E\bigl[e^{-\theta(nx + \sigma\sqrt{n}V) + n\kappa(\theta)};\, V > 0\bigr] \\
&= e^{-\eta n} \int_0^\infty e^{-\theta\sigma\sqrt{n}\,y}\, \frac{1}{\sqrt{2\pi}}\, e^{-y^2/2}\, dy
\;\approx\; e^{-\eta n}\, \frac{1}{\theta\sigma\sqrt{2\pi n}},
\end{aligned}$$

which is the same as (1.1), commonly denoted as the saddlepoint approximation.
The substitution by $V$ needs, however, to be made rigorous; see Jensen [506] or
[APQ, pp. 355-356] for details.
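
As a numerical illustration of Example 1.1, the following sketch compares the exact tail, the Chernoff bound (1.4) and the saddlepoint approximation (1.1) in one concrete case. The choice $X_i \sim$ Exp(1) and all function names are assumptions made here for illustration; they are not taken from the text. For this choice $\kappa(\theta) = -\log(1-\theta)$, so that $\theta = 1 - 1/x$, $\eta = x - 1 - \log x$ and $\sigma = x$, and $S_n$ is Gamma$(n,1)$, whose tail equals a Poisson probability.

```python
import math

def tilt_exp1(x):
    """Saddlepoint quantities for X ~ Exp(1) and a level x > E X = 1.

    kappa(theta) = -log(1 - theta) for theta < 1, so kappa'(theta) = x gives
    theta = 1 - 1/x, eta = kappa*(x) = x - 1 - log x, sigma^2 = kappa''(theta) = x^2.
    """
    theta = 1.0 - 1.0 / x
    eta = theta * x + math.log(1.0 - theta)   # theta*x - kappa(theta)
    sigma = x
    return theta, eta, sigma

def exact_tail(n, x):
    """P(S_n > n x) for S_n ~ Gamma(n, 1), computed as P(Poisson(n x) <= n - 1)."""
    t = n * x
    return sum(math.exp(-t + k * math.log(t) - math.lgamma(k + 1))
               for k in range(n))

def chernoff(n, x):
    """Right-hand side of the Chernoff bound (1.4)."""
    _, eta, _ = tilt_exp1(x)
    return math.exp(-eta * n)

def saddlepoint(n, x):
    """Right-hand side of the sharp asymptotics (1.1)."""
    theta, eta, sigma = tilt_exp1(x)
    return math.exp(-eta * n) / (theta * sigma * math.sqrt(2.0 * math.pi * n))

if __name__ == "__main__":
    x = 1.5
    for n in (10, 50, 200):
        p = exact_tail(n, x)
        # columns: n, exact tail, Chernoff bound, saddlepoint value, ratio exact/saddlepoint
        print(n, p, chernoff(n, x), saddlepoint(n, x), p / saddlepoint(n, x))
```

The ratio in the last column should approach one as $n$ grows, whereas the Chernoff bound overshoots by the factor $\theta\sigma\sqrt{2\pi n}$, which is exactly the prefactor that logarithmic asymptotics cannot see.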


Further main results in large deviations theory are the Gärtner-Ellis theorem,
which is a version of Cramér’s theorem where independence is weakened to the
existence of $\kappa(\theta) = \lim_{n\to\infty} \log E e^{\theta S_n}/n$; Sanov’s theorem, which gives rare
events asymptotics for empirical distributions; Mogul’skiĭ’s theorem, which gives
path asymptotics, that is, asymptotics for probabilities of the form

$$P\bigl(\{S_{\lfloor nt\rfloor}/n\}_{0\le t\le 1} \in \Gamma\bigr)$$

for a suitable set $\Gamma$ of functions on $[0,1]$; and the Wentzell-Freidlin theory of
slow Markov walks, which is of similar spirit as the discussion in VIII.3.

In the application of large deviations to ruin probabilities, we shall
concentrate on a result which gives asymptotics under conditions similar to the
Gärtner-Ellis theorem:


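Theorem 1.2 (GLYNN & WHITT [419]) Let $X_1, X_2, \ldots$ be a sequence of r.v.’s,
and write $S_n = X_1 + \cdots + X_n$, $\tau(u) = \inf\{n : S_n > u\}$ and $\psi(u) = P(\tau(u) < \infty)$.
Assume that there exist $\gamma, \epsilon > 0$ such that
(i) $\kappa_n(\theta) = \log E e^{\theta S_n}$ is well-defined and finite for $\gamma - \epsilon < \theta < \gamma + \epsilon$;
(ii) $\limsup_{n\to\infty} E e^{\theta X_n} < \infty$ for $-\epsilon < \theta < \epsilon$;
(iii) $\kappa(\theta) = \lim_{n\to\infty} \frac{1}{n}\kappa_n(\theta)$ exists and is finite for $\gamma - \epsilon < \theta < \gamma + \epsilon$;
(iv) $\kappa(\gamma) = 0$ and $\kappa$ is differentiable at $\gamma$ with $0 < \kappa'(\gamma) < \infty$.
Then $\psi(u) \overset{\log}{\sim} e^{-\gamma u}$ as $u \to \infty$.
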
For the proof, we introduce a change of measure for $X_1, \ldots, X_n$ given by

$$\tilde F_n(dx_1, \ldots, dx_n) = e^{\gamma s_n - \kappa_n(\gamma)}\, F_n(dx_1, \ldots, dx_n)$$

where $F_n$ is the distribution of $(X_1, \ldots, X_n)$ and $s_n = x_1 + \cdots + x_n$ (note that
the r.h.s. integrates to 1 by the definition of $\kappa_n$). We further write
$\tilde\mu = \kappa'(\gamma)$. We shall need:


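Lemma 1.3 For each $\eta > 0$, there exists $z \in (0,1)$ and $n_0$ such that

$$\tilde P_n\Bigl(\Bigl|\frac{S_n}{n} - \tilde\mu\Bigr| > \eta\Bigr) \le z^n, \qquad
\tilde P_n\Bigl(\Bigl|\frac{S_{n-1}}{n} - \tilde\mu\Bigr| > \eta\Bigr) \le z^n$$

for $n \ge n_0$.
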
Proof. Let $0 < \theta < \epsilon$ where $\epsilon$ is as in Theorem 1.2. Clearly,

$$\tilde P_n(S_n/n > \tilde\mu + \eta) \le e^{-n\theta(\tilde\mu+\eta)}\, \tilde E_n e^{\theta S_n}
= e^{-n\theta(\tilde\mu+\eta)}\, e^{\kappa_n(\theta+\gamma) - \kappa_n(\gamma)}.$$

Hence by (iii) and (iv),

$$\limsup_{n\to\infty} \frac{1}{n} \log \tilde P_n(S_n/n > \tilde\mu + \eta) \le \kappa(\theta + \gamma) - \theta\tilde\mu - \theta\eta,$$

and by Taylor expansion and (iv), the r.h.s. is of order $-\theta\eta + o(\theta)$ as $\theta \downarrow 0$;
in particular, the r.h.s. can be chosen strictly negative by taking $\theta$ small enough.
This proves the existence of $z < 1$ and $n_0$ such that $\tilde P_n(S_n/n > \tilde\mu + \eta) \le z^n$ for
$n \ge n_0$. The corresponding claim for $\tilde P_n(S_n/n < \tilde\mu - \eta)$ follows by symmetry
(note that the argument did not use $\tilde\mu > 0$). This establishes the first claim of
the lemma, for $S_n$.


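For $S_{n-1}$, we have

$$\begin{aligned}
\tilde P_n(S_{n-1}/n > \tilde\mu + \eta)
&\le e^{-n\theta(\tilde\mu+\eta)}\, \tilde E_n e^{\theta S_{n-1}}
 = e^{-n\theta(\tilde\mu+\eta)}\, \tilde E_n e^{\theta S_n - \theta X_n} \\
&= e^{-n\theta(\tilde\mu+\eta)}\, E e^{(\theta+\gamma) S_n - \theta X_n - \kappa_n(\gamma)} \\
&\le e^{-n\theta(\tilde\mu+\eta)}\, e^{-\kappa_n(\gamma)} \bigl[E e^{p(\theta+\gamma)S_n}\bigr]^{1/p} \bigl[E e^{-q\theta X_n}\bigr]^{1/q} \\
&= e^{-n\theta(\tilde\mu+\eta)}\, e^{-\kappa_n(\gamma)}\, e^{\kappa_n(p(\theta+\gamma))/p}\, \bigl[E e^{-q\theta X_n}\bigr]^{1/q}
\end{aligned}$$

where we used Hölder’s inequality with $1/p + 1/q = 1$ and $p$ chosen so close to
1 and $\theta$ so close to 0 that $|p(\theta + \gamma) - \gamma| < \epsilon$ and $|q\theta| < \epsilon$. Since
$[E e^{-q\theta X_n}]^{1/q}$ is bounded for large $n$ by (ii), we get

$$\limsup_{n\to\infty} \frac{1}{n} \log \tilde P_n(S_{n-1}/n > \tilde\mu + \eta) \le -\theta(\tilde\mu + \eta) + \kappa(p(\theta + \gamma))/p$$

and by Taylor expansion, it is easy to see that the r.h.s. can be chosen strictly
negative by taking $p$ close enough to 1 and $\theta$ close enough to 0. The rest of the
argument is as before.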


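Proof of Theorem 1.2. We first show that $\liminf_{u\to\infty} \log\psi(u)/u \ge -\gamma$. Let
$\eta > 0$ be given and let $m = m(\eta) = \lfloor u(1+\eta)/\tilde\mu \rfloor + 1$. Then

$$\begin{aligned}
\psi(u) &\ge P(S_m > u) = \tilde E_m\bigl[e^{-\gamma S_m + \kappa_m(\gamma)};\, S_m > u\bigr] \\
&\ge \tilde E_m\Bigl[e^{-\gamma S_m + \kappa_m(\gamma)};\, \frac{S_m}{m} > \frac{\tilde\mu}{1+\eta}\Bigr] \\
&\ge \tilde E_m\Bigl[e^{-\gamma S_m + \kappa_m(\gamma)};\, \Bigl|\frac{S_m}{m} - \tilde\mu\Bigr| < \frac{\tilde\mu\eta}{1+\eta}\Bigr] \\
&\ge \exp\Bigl\{-\gamma\tilde\mu m\, \frac{1+2\eta}{1+\eta} + \kappa_m(\gamma)\Bigr\}\,
\tilde P_m\Bigl(\Bigl|\frac{S_m}{m} - \tilde\mu\Bigr| < \frac{\tilde\mu\eta}{1+\eta}\Bigr).
\end{aligned}$$

Here $\tilde P_m(\cdot)$ goes to 1 by Lemma 1.3, and since $\kappa_m(\gamma)/u \to 0$ and
$m/u \to (1+\eta)/\tilde\mu$, we get

$$\liminf_{u\to\infty} \frac{1}{u} \log\psi(u) \ge -\gamma(1 + 2\eta).$$

Letting $\eta \downarrow 0$ yields $\liminf_{u\to\infty} \log\psi(u)/u \ge -\gamma$.
For $\limsup_{u\to\infty} \log\psi(u)/u \le -\gamma$, we write

$$\psi(u) = \sum_{n=1}^{\infty} P(\tau(u) = n) = I_1 + I_2 + I_3 + I_4$$

where

$$I_1 = \sum_{n=1}^{n(\delta)} P(\tau(u) = n), \qquad
I_2 = \sum_{n=n(\delta)+1}^{\lfloor u(1-\delta)/\tilde\mu\rfloor} P(\tau(u) = n),$$

$$I_3 = \sum_{n=\lfloor u(1-\delta)/\tilde\mu\rfloor+1}^{\lfloor u(1+\delta)/\tilde\mu\rfloor} P(\tau(u) = n), \qquad
I_4 = \sum_{n=\lfloor u(1+\delta)/\tilde\mu\rfloor+1}^{\infty} P(\tau(u) = n)$$

and $n(\delta)$ is chosen such that $\kappa_n(\gamma)/n < \min\{\delta, (-\log z)/2\}$ and

$$\tilde P_n\Bigl(\Bigl|\frac{S_n}{n} - \tilde\mu\Bigr| > \frac{\delta\tilde\mu}{1+\delta}\Bigr) \le z^n, \qquad
\tilde P_n\Bigl(\Bigl|\frac{S_{n-1}}{n} - \tilde\mu\Bigr| > \frac{\delta\tilde\mu}{1+\delta}\Bigr) \le z^n \tag{1.6}$$

for some $z < 1$ and all $n \ge n(\delta)$; this is possible by (iii), (iv) and Lemma 1.3.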
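Obviously,

$$\begin{aligned}
P(\tau(u) = n) \le P(S_n > u) &= \tilde E_n\bigl[e^{-\gamma S_n + \kappa_n(\gamma)};\, S_n > u\bigr] \\
&\le e^{-\gamma u + \kappa_n(\gamma)}\, \tilde P_n(S_n > u)
\end{aligned} \tag{1.7}$$

so that

$$I_1 \le e^{-\gamma u} \sum_{n=1}^{n(\delta)} e^{\kappa_n(\gamma)}, \tag{1.8}$$

$$\begin{aligned}
I_2 &\le e^{-\gamma u} \sum_{n=n(\delta)+1}^{\lfloor u(1-\delta)/\tilde\mu\rfloor} e^{\kappa_n(\gamma)}\, \tilde P_n(S_n > u) \\
&\le e^{-\gamma u} \sum_{n=n(\delta)+1}^{\lfloor u(1-\delta)/\tilde\mu\rfloor} e^{\kappa_n(\gamma)}\,
\tilde P_n\Bigl(\Bigl|\frac{S_n}{n} - \tilde\mu\Bigr| > \frac{\delta\tilde\mu}{1+\delta}\Bigr) \\
&\le e^{-\gamma u} \sum_{n=n(\delta)+1}^{\lfloor u(1-\delta)/\tilde\mu\rfloor} z^{-n/2} z^{n}
\le e^{-\gamma u} \sum_{n=0}^{\infty} z^{n/2} = \frac{e^{-\gamma u}}{1 - z^{1/2}},
\end{aligned} \tag{1.9}$$

$$\begin{aligned}
I_3 &\le e^{-\gamma u} \sum_{n=\lfloor u(1-\delta)/\tilde\mu\rfloor+1}^{\lfloor u(1+\delta)/\tilde\mu\rfloor} e^{\kappa_n(\gamma)}
\le e^{-\gamma u} \sum_{n=\lfloor u(1-\delta)/\tilde\mu\rfloor+1}^{\lfloor u(1+\delta)/\tilde\mu\rfloor} e^{n\delta} \\
&\le e^{-\gamma u} \Bigl(\frac{2\delta u}{\tilde\mu} + 1\Bigr) e^{\delta(1+\delta)u/\tilde\mu}.
\end{aligned} \tag{1.10}$$

Finally,

$$\begin{aligned}
I_4 &\le \sum_{n=\lfloor u(1+\delta)/\tilde\mu\rfloor+1}^{\infty} P(S_{n-1} \le u,\, S_n > u) \\
&= \sum_{n=\lfloor u(1+\delta)/\tilde\mu\rfloor+1}^{\infty} \tilde E_n\bigl[e^{-\gamma S_n + \kappa_n(\gamma)};\, S_{n-1} \le u,\, S_n > u\bigr] \\
&\le e^{-\gamma u} \sum_{n=\lfloor u(1+\delta)/\tilde\mu\rfloor+1}^{\infty} e^{\kappa_n(\gamma)}\,
\tilde P_n\Bigl(\Bigl|\frac{S_{n-1}}{n} - \tilde\mu\Bigr| > \frac{\delta\tilde\mu}{1+\delta}\Bigr) \\
&\le e^{-\gamma u} \sum_{n=\lfloor u(1+\delta)/\tilde\mu\rfloor+1}^{\infty} z^{n/2} \le \frac{e^{-\gamma u}}{1 - z^{1/2}}.
\end{aligned} \tag{1.11}$$

Thus an upper bound for $\psi(u)$ is

$$e^{-\gamma u} \Bigl[\sum_{n=1}^{n(\delta)} e^{\kappa_n(\gamma)} + \frac{2}{1 - z^{1/2}}
+ \Bigl(\frac{2\delta u}{\tilde\mu} + 1\Bigr) e^{\delta(1+\delta)u/\tilde\mu}\Bigr],$$

and using (i), we get

$$\limsup_{u\to\infty} \frac{\log\psi(u)}{u} \le -\gamma + \frac{\delta(1+\delta)}{\tilde\mu}.$$

Let $\delta \downarrow 0$.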


The following corollary shows that given that ruin occurs, the typical time
is $u/\kappa'(\gamma)$ just as for the compound Poisson model, cf. V.4.


Corollary 1.4 Under the assumptions of Theorem 1.2, it holds for each $\delta > 0$
that

$$\psi(u) \overset{\log}{\sim} P\bigl(\tau(u) \in (u(1-\delta)/\kappa'(\gamma),\, u(1+\delta)/\kappa'(\gamma))\bigr).$$


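Proof. Since

$$\psi(u) = I_1 + I_2 + I_3 + I_4, \qquad
I_3 = P\bigl(\tau(u) \in (u(1-\delta)/\kappa'(\gamma),\, u(1+\delta)/\kappa'(\gamma))\bigr),$$

it suffices to show that for $j = 1, 2, 4$ there is an $\alpha_j > 0$ and a $c_j < \infty$ such
that $I_j \le c_j e^{-\gamma u} e^{-\alpha_j u}$. For $I_4$, this is straightforward since the last
inequality in (1.11) can be sharpened to

$$I_4 \le e^{-\gamma u}\, \frac{z^{(\lfloor u(1+\delta)/\tilde\mu\rfloor + 1)/2}}{1 - z^{1/2}}.$$

For $I_1, I_2$, we need to redefine $n(\delta)$ as $\lfloor\beta u\rfloor$ where $\beta$ is so small that
$\omega = 1 - 4\beta\kappa'(\gamma) > 0$. For $I_2$, the last steps of (1.9) can then be sharpened to

$$I_2 \le e^{-\gamma u}\, \frac{z^{(\lfloor\beta u\rfloor + 1)/2}}{1 - z^{1/2}}$$

to give the desired conclusion.
For $I_1$, we replace the bound $\tilde P_n(S_n > u) \le 1$ used in (1.8) by

$$\tilde P_n(S_n > u) \le e^{-\alpha u}\, \tilde E_n e^{\alpha S_n} = e^{-\alpha u}\, e^{\kappa_n(\alpha+\gamma) - \kappa_n(\gamma)}$$

where $0 < \alpha < \epsilon$ and $\alpha$ is so small that $\kappa(\gamma + \alpha) \le 2\alpha\kappa'(\gamma)$. Then for $n$ large,
say $n \ge n_1$, we have

$$\kappa_n(\alpha + \gamma) \le 2n\kappa(\gamma + \alpha) \le 4n\alpha\kappa'(\gamma).$$

Letting $c_{11} = \max_{n \le n_1} e^{\kappa_n(\alpha+\gamma)}$, we get

$$\begin{aligned}
I_1 &\le \sum_{n=1}^{\lfloor\beta u\rfloor} \exp\{-(\gamma + \alpha)u + \kappa_n(\alpha + \gamma)\} \\
&\le \exp\{-(\gamma + \alpha)u\} \Bigl\{n_1 c_{11} + \sum_{n=1}^{\lfloor\beta u\rfloor} \exp\{4n\alpha\kappa'(\gamma)\}\Bigr\} \\
&\le \exp\{-(\gamma + \alpha)u\}\, c_1 \exp\{4\beta u\alpha\kappa'(\gamma)\} = c_1 e^{-\gamma u} e^{-\alpha_1 u},
\end{aligned}$$

where $\alpha_1 = \alpha\omega$.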


The criteria given in Theorem 1.2 are the somewhat natural extension of
those for the renewal model discussed in Chapter VI: due to the independence
of the increments $X_i = U_i - T_i$, condition (iv) in that case simplifies to
$\kappa(\gamma) = \frac{1}{n} \log E\bigl(e^{\gamma\sum_{i=1}^{n} X_i}\bigr) = \log E\bigl(e^{\gamma(U_i - T_i)}\bigr)$ (cf. VI.(3.1)). In the
renewal set-up, of course, also the stronger result of the Cramér-Lundberg
approximation holds (cf. Theorem VI.3.2).


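Example 1.5 Assume the $X_n$ form a stationary Gaussian sequence with mean
$\mu < 0$. It is then well-known and easy to prove that $S_n$ has a normal distribution
with mean $n\mu$ and a variance $\omega_n^2$ satisfying

$$\lim_{n\to\infty} \frac{1}{n}\,\omega_n^2 = \omega^2 = \mathrm{Var}(X_1) + 2\sum_{k=1}^{\infty} \mathrm{Cov}(X_1, X_{k+1})$$

provided the sum converges absolutely. Hence

$$\frac{1}{n}\kappa_n(\theta) = \frac{1}{n}\Bigl(n\theta\mu + \frac{\theta^2\omega_n^2}{2}\Bigr) \to \kappa(\theta) = \theta\mu + \frac{\theta^2\omega^2}{2}$$

for all $\theta \in \mathbb{R}$, and we conclude that Theorem 1.2 is in force with $\gamma = -2\mu/\omega^2$.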
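
To make the role of the long-run variance $\omega^2$ in Example 1.5 concrete, here is a minimal sketch for one particular stationary Gaussian sequence. The AR(1) specification, the parameter values and the function names are assumptions chosen for illustration only. For an AR(1) sequence with coefficient $a$ and innovation variance $s^2$, the autocovariances are $\mathrm{Var}(X_1)\,a^k$ and the series for $\omega^2$ sums to $s^2/(1-a)^2$.

```python
def long_run_variance(var_x, autocov, terms=10_000):
    """omega^2 = Var(X_1) + 2 * sum_{k>=1} Cov(X_1, X_{k+1}), truncated after `terms`."""
    return var_x + 2.0 * sum(autocov(k) for k in range(1, terms + 1))

def adjustment_coefficient(mu, omega2):
    """gamma = -2 mu / omega^2, the positive root of theta*mu + theta^2*omega^2/2."""
    if mu >= 0:
        raise ValueError("need negative drift mu < 0")
    return -2.0 * mu / omega2

# Hypothetical AR(1) input: X_n - mu = a (X_{n-1} - mu) + eps_n, eps_n ~ N(0, s2).
mu, a, s2 = -0.5, 0.6, 1.0
var_x = s2 / (1.0 - a * a)                      # stationary variance
omega2 = long_run_variance(var_x, lambda k: var_x * a ** k)
gamma = adjustment_coefficient(mu, omega2)      # with dependence
gamma_iid = adjustment_coefficient(mu, var_x)   # same marginals, independent increments
print(omega2, gamma, gamma_iid)
```

With positive autocorrelation ($a > 0$) the long-run variance exceeds the marginal variance, so the adjustment coefficient is smaller than in the independent case with the same marginals.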


Inspection of the proof of Theorem 1.2 shows that the discrete time structure
is used in an essential way. Obviously many of the most interesting examples
have a continuous time scale. If $\{S_t\}_{t\ge 0}$ is the claims surplus process, the
key condition similar to (iii), (iv) becomes existence of a limit $\kappa(\theta)$ of
$\kappa_t(\theta)/t = \log E e^{\theta S_t}/t$ and a $\gamma > 0$ with $\kappa(\gamma) = 0$, $\kappa'(\gamma) > 0$. Assuming that the
further regularity conditions can be verified, Theorem 1.2 then immediately
yields the estimate

$$P\Bigl(\sup_{k=0,1,\ldots} S_{kh} > u\Bigr) \overset{\log}{\sim} e^{-\gamma u} \tag{1.12}$$

for the ruin probability $\psi_h(u)$ of any discrete skeleton $\{S_{kh}\}_{k=0,1,\ldots}$. The
problem is whether this is also the correct logarithmic asymptotics for the
(larger) ruin probability $\psi(u)$ of the whole process, i.e. whether

$$P\Bigl(\sup_{0\le t<\infty} S_t > u\Bigr) \overset{\log}{\sim} e^{-\gamma u}. \tag{1.13}$$

One would expect this to hold in considerable generality, and in fact, criteria
are given in Duffield & O’Connell [331]. To verify these in concrete examples
may well present considerable difficulties, but nevertheless, we shall give two
continuous time examples and tacitly assume that this can be done. The reader
not satisfied by this gap in the argument can easily construct a discrete time
version of the models!

The following formula (1.14) is needed in both examples. Let $\{N_t\}_{t\ge 0}$ be
a possibly inhomogeneous Poisson process with arrival rate $\beta(s)$ at time $s$. An
event occurring at time $s$ is rewarded by a r.v. $V(s)$ with m.g.f. $\phi_s(\theta)$. Thus the
total reward in the interval $[0, t]$ is

$$A_t = \sum_{n:\, \sigma_n \le t} V(\sigma_n)$$

where the $\sigma_n$ are the event times.¹ Then

$$\log E e^{\theta A_t} = \int_0^t \beta(s)\bigl(\phi_s(\theta) - 1\bigr)\, ds \tag{1.14}$$

(to see this, derive, e.g., a differential equation in $t$).
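
Formula (1.14) is easy to evaluate numerically once the rate and the m.g.f. of the rewards are given. The sketch below uses a plain midpoint rule; the helper name `log_mgf_reward` and the two test inputs (a constant rate with exponential rewards, and a periodic rate with unit rewards) are assumptions introduced for illustration. In the first case (1.14) must reduce to the familiar compound Poisson expression $\beta t(\hat B[\theta] - 1)$.

```python
import math

def log_mgf_reward(theta, t, rate, mgf, steps=20_000):
    """Midpoint-rule value of int_0^t rate(s) * (mgf(s, theta) - 1) ds, cf. (1.14)."""
    h = t / steps
    return h * sum(rate((i + 0.5) * h) * (mgf((i + 0.5) * h, theta) - 1.0)
                   for i in range(steps))

# Check 1: constant rate beta and Exp(delta) rewards (compound Poisson case).
beta, delta, theta, t = 2.0, 3.0, 1.0, 5.0
numeric = log_mgf_reward(theta, t, lambda s: beta,
                         lambda s, th: delta / (delta - th))
closed = beta * t * (delta / (delta - theta) - 1.0)

# Check 2: periodic rate 1 + cos(2 pi s) over one period, deterministic rewards V(s) = 1.
periodic = log_mgf_reward(theta, 1.0,
                          lambda s: 1.0 + math.cos(2.0 * math.pi * s),
                          lambda s, th: math.exp(th))
print(numeric, closed, periodic)
```

In the second check the rate integrates to one over the period, so the result should agree with $e^{\theta} - 1$, the value for a unit-rate Poisson process with unit rewards.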


Example 1.6 We assume that claims arrive according to a homogeneous Poisson
process with intensity $\beta$, but that a claim is not settled immediately. More
precisely, if the $n$th claim arrives at time $\sigma_n$, then the corresponding payment
from the company in $[\sigma_n, \sigma_n + s]$ is a r.v. $U_n(s)$. Thus, assuming a continuous
premium inflow at unit rate, we have

$$S_t = \sum_{n:\, \sigma_n \le t} U_n(t - \sigma_n) - t,$$

which is a shot-noise process. We further assume that the processes $\{U_n(s)\}_{s\ge 0}$
are i.i.d., non-decreasing and with finite limits $U_n(\infty)$ as $s \uparrow \infty$ (thus, $U_n(\infty)$
represents the total payment for the $n$th claim). We let
$\kappa(\theta) = \beta\bigl(E e^{\theta U_n(\infty)} - 1\bigr) - \theta$ and assume there are $\gamma, \epsilon > 0$ such that $\kappa(\gamma) = 0$
and that $\kappa(\theta) < \infty$ for $\theta < \gamma + \epsilon$.

If the $n$th claim arrives at time $\sigma_n = s$, it contributes to $S_t$ by the amount
$U_n(t - s)$. Thus by (1.14),

$$\kappa_t(\theta) = \beta\int_0^t \bigl(E e^{\theta U_n(t-s)} - 1\bigr)\, ds - \theta t
= \beta\int_0^t \bigl(E e^{\theta U_n(s)} - 1\bigr)\, ds - \theta t,$$

and since $E e^{\theta U_n(s)} \to E e^{\theta U_n(\infty)}$ as $s \to \infty$, we have $\kappa_t(\theta)/t \to \kappa(\theta)$. Since
the remaining conditions of Theorem 1.2 are trivial to verify, we conclude that
$\psi(u) \overset{\log}{\sim} e^{-\gamma u}$ (cf. the above discussion of discrete skeletons).


¹ Another interpretation is to consider $V(s)$ as a claim whose distribution depends
on the time $s$ of its occurrence and $A_t$ as the aggregate sum of such claims.

It is interesting to note and intuitively reasonable that the adjustment
coefficient $\gamma$ for the shot-noise model is the same as the one for the
Cramér-Lundberg model where a claim is immediately settled by the amount $U_n(\infty)$.
Of course, the Cramér-Lundberg model has the larger ruin probability.


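Example 1.7 Given the safety loading $\eta$, the Cramér-Lundberg model implicitly
assumes that the Poisson intensity $\beta$ and the claim size distribution $B$ (or
at least its mean $\mu_B$) are known. Of course, this is often not realistic. An
apparent solution to this problem is to calculate the premium rate $p = p(t)$ at
time $t$ based upon claims statistics. Most obviously, the best estimator of $\beta\mu_B$
based upon $\mathcal{F}_{t-}$, where $\mathcal{F}_t = \sigma(A_s : 0 \le s \le t)$, $A_t = \sum_{i=1}^{N_t} U_i$, is $A_{t-}/t$. Thus,
one would take $p(t) = (1 + \eta)A_{t-}/t$, leading to

$$S_t = A_t - (1 + \eta)\int_0^t \frac{A_s}{s}\, ds. \tag{1.15}$$

With the $\sigma_i$ being the arrival times, we have

$$S_t = \sum_{i=1}^{N_t} U_i - (1 + \eta)\sum_{i=1}^{N_t} U_i \int_{\sigma_i}^t \frac{1}{s}\, ds
= \sum_{i=1}^{N_t} U_i\Bigl[1 - (1 + \eta)\log\frac{t}{\sigma_i}\Bigr]. \tag{1.16}$$

Let $\kappa_t(\alpha) = \log E e^{\alpha S_t}$. It then follows from (1.14) that, with $\hat B[\cdot]$ denoting
the m.g.f. of $B$,

$$\kappa_t(\alpha) = \beta\int_0^t \hat B\Bigl[\alpha\Bigl(1 - (1 + \eta)\log\frac{t}{s}\Bigr)\Bigr]\, ds - \beta t = t\kappa(\alpha) \tag{1.17}$$

where

$$\kappa(\alpha) = \beta\int_0^1 \hat B\bigl[\alpha(1 + (1 + \eta)\log u)\bigr]\, du - \beta. \tag{1.18}$$

Thus (iii) of Theorem 1.2 holds, and since the remaining conditions are trivial to
verify, we conclude that $\psi(u) \overset{\log}{\sim} e^{-\gamma u}$ (cf. again the above discussion of discrete
skeletons) where $\gamma$ solves $\kappa(\gamma) = 0$.
It is interesting to compare the adjustment coefficient $\gamma$ with the one $\gamma^*$ of
the Cramér-Lundberg model, i.e. the solution of

$$\beta\bigl(\hat B[\gamma^*] - 1\bigr) - (1 + \eta)\beta\mu_B\gamma^* = 0. \tag{1.19}$$

Indeed, one has

$$\gamma \ge \gamma^* \tag{1.20}$$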
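with equality if and only if $U$ is degenerate. Thus, typically the adaptive
premium rule leads to a ruin probability which is asymptotically smaller than for
the Cramér-Lundberg model. To see this, rewrite first $\kappa$ as

$$\kappa(\alpha) = \beta E\Bigl[\frac{e^{\alpha U}}{1 + (1 + \eta)\alpha U}\Bigr] - \beta. \tag{1.21}$$

This follows from the probabilistic interpretation $S_1 \overset{\mathcal{D}}{=} \sum_{i=1}^{N_1} Y_i$ where

$$Y_i = U_i\bigl(1 + (1 + \eta)\log\Theta_i\bigr) = U_i\bigl(1 - (1 + \eta)V_i\bigr)$$

where the $\Theta_i$ are i.i.d. uniform(0,1) or, equivalently, the $V_i = -\log\Theta_i$ are i.i.d.
standard exponential, which yields

$$E e^{\alpha Y} = E\Bigl[e^{\alpha U}\int_0^1 t^{(1+\eta)\alpha U}\, dt\Bigr] = E\Bigl[\frac{e^{\alpha U}}{1 + (1 + \eta)\alpha U}\Bigr].$$

Next, the function $k(x) = e^{\gamma^* x} - 1 - (1 + \eta)\gamma^* x$ is convex with $k(\infty) = \infty$,
$k(0) = 0$, $k'(0) < 0$, so there exists a unique zero $x_0 = x_0(\eta) > 0$ such that
$k(x) > 0$, $x > x_0$, and $k(x) < 0$, $0 < x < x_0$. Therefore

$$\begin{aligned}
\frac{\kappa(\gamma^*)}{\beta} &= E\Bigl[\frac{e^{\gamma^* U}}{1 + (1 + \eta)\gamma^* U}\Bigr] - 1
= E\Bigl[\frac{k(U)}{1 + (1 + \eta)\gamma^* U}\Bigr] \\
&= \int_0^{x_0} \frac{k(y)}{1 + (1 + \eta)\gamma^* y}\, B(dy) + \int_{x_0}^{\infty} \frac{k(y)}{1 + (1 + \eta)\gamma^* y}\, B(dy) \\
&\le \frac{1}{1 + (1 + \eta)\gamma^* x_0}\Bigl\{\int_0^{x_0} k(y)\, B(dy) + \int_{x_0}^{\infty} k(y)\, B(dy)\Bigr\} = 0,
\end{aligned}$$

using that $Ek(U) = 0$ because of (1.19). This implies $\kappa(\gamma^*) \le 0$, and since $\kappa(s)$,
$\kappa^*(s)$ are convex with $\kappa'(0) < 0$, $\kappa^{*\prime}(0) < 0$, this in turn yields $\gamma \ge \gamma^*$. Further,
$\gamma = \gamma^*$ can only occur if $U \equiv x_0$.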
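
The comparison (1.20) can be checked numerically in a special case. The following sketch assumes exponential claims with rate $\delta$ (so that the m.g.f. is $\delta/(\delta - s)$ and the Cramér-Lundberg root of (1.19) is $\gamma^* = \delta\eta/(1+\eta)$); the parameter values, the midpoint rule for (1.18) and the helper names are illustrative assumptions, not part of the example above.

```python
import math

def kappa_over_beta(alpha, eta, delta, steps=20_000):
    """kappa(alpha)/beta from (1.18) for Exp(delta) claims, midpoint rule on (0, 1).

    Uses the m.g.f. delta / (delta - s), valid for s < delta; requires alpha < delta.
    """
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        s = alpha * (1.0 + (1.0 + eta) * math.log(u))   # argument of the m.g.f.
        total += delta / (delta - s)
    return h * total - 1.0

def bisect(f, lo, hi, iters=40):
    """Plain bisection; assumes f(lo) < 0 < f(hi)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

eta, delta = 0.2, 1.0
gamma_cl = delta * eta / (1.0 + eta)          # root of (1.19) for Exp(delta) claims
gamma_adaptive = bisect(lambda a: kappa_over_beta(a, eta, delta),
                        gamma_cl, 0.9 * delta)
print(gamma_cl, gamma_adaptive)
```

Starting the bisection at $\gamma^*$ uses the fact, shown above, that $\kappa(\gamma^*) \le 0$; the root found for the adaptive premium rule then lies strictly to the right of $\gamma^*$, in line with (1.20) for non-degenerate claims.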


Condition (iii) of Theorem 1.2 reflects that the ruin probability still decays 
exponentially if the involved dependence is weak enough such that the logarith- 
mic average of the moment generating functions of the (light-tailed) increment 
distributions converges. As Example 1.5 indicates, this will usually only be the 
case for short-range dependence in the risk process. 

If $\kappa_n(\theta)/v_n$ does not converge for $v_n = n$, but converges for another rate
function $v_n$, it is also sometimes possible to derive the limiting behavior of $\psi(u)$
by large deviations techniques. The needed technical assumptions are then
more involved and we just mention the type of result that one can expect in
such situations (formulated for a continuous-time risk process): if

$$\kappa(\theta) = \lim_{t\to\infty} \frac{1}{v(t)} \log E e^{\theta v(t) S_t/a(t)} \tag{1.22}$$

exists for some scaling functions $a(t) : \mathbb{R}^+ \to \mathbb{R}^+$ and $v(t) : \mathbb{R}^+ \to \mathbb{R}^+$ with
$a(t), v(t) \uparrow \infty$, and there exists another increasing scaling function $h(t)$ such
that $g(d) = \lim_{t\to\infty} v(a^{-1}(t/d))/h(t)$ exists for all $d > 0$, then under some
additional technical assumptions,

$$\lim_{u\to\infty} \frac{1}{h(u)} \log\psi(u) = -\inf_{d>0}\Bigl[g(d)\sup_{\theta\in\mathbb{R}}\bigl(\theta d - \kappa(\theta)\bigr)\Bigr]. \tag{1.23}$$


In particular, $\psi(u) \overset{\log}{\sim} e^{-c\,h(u)}$, where $c$ denotes the value of the infimum in (1.23).


Example 1.8 Consider a continuous-time stationary zero-mean Gaussian process
$\{Z_t\}$ with arbitrary covariance function $\mathrm{Cov}(s, t) = E(Z_s Z_t)$ and let
$S_t = Z_t - \mu t$ for some $\mu > 0$. Then (1.22) holds with $a(t) = t$ and $v(t) = t^2/\sigma_t^2$, where
$\sigma_t^2 = E(Z_t^2)$. For the choice $h(t) = v(t)$, the expression
$g(d) = \lim_{t\to\infty} \sigma_t^2/(d^2\sigma_{t/d}^2)$ has to be finite for all $d > 0$, which introduces a
condition on $\sigma_t$. It turns out that one can in fact verify all the technical
assumptions underlying the above result and obtains

$$\lim_{u\to\infty} \frac{\sigma_u^2}{u^2} \log\psi(u) = -\inf_{d>0}\bigl[g(d)(d + \mu)^2/2\bigr].$$

If $\sigma_t^2/t \to \sigma^2 > 0$, then this formula simplifies to $\lim_{u\to\infty} \frac{1}{u}\log\psi(u) = -2\mu/\sigma^2$,
which is the continuous-time version of Example 1.5 (and corresponds to
short-range dependence of $S_t$).

On the other hand, for $E(Z_s Z_t) = \frac{1}{2}\bigl(s^{2H} + t^{2H} - |s - t|^{2H}\bigr)$ we arrive at the
case of Fractional Brownian Motion with Hurst exponent $H \in (0,1)$ (which is
long-range dependent). From $\sigma_t^2 = t^{2H}$, one can easily derive that in this case

$$\lim_{u\to\infty} \frac{1}{u^{2-2H}} \log\psi(u) = -\frac{\mu^{2H}}{2H^{2H}(1 - H)^{2-2H}} \tag{1.24}$$

which shows that the ruin probability has a Weibull-type tail. For $H > 0.5$
(positive dependence of the increments) this is an instance where, despite
light-tailed increments, the involved dependence leads to a heavy-tailed ruin
probability.
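
The constant in the Fractional Brownian Motion case can be cross-checked by carrying out the minimization in the variational formula of Example 1.8 directly. The sketch below assumes $\sigma_t^2 = t^{2H}$, so that $g(d) = d^{2H-2}$; calculus then gives the minimizer $d = \mu(1-H)/H$ and the value $\mu^{2H}/(2H^{2H}(1-H)^{2-2H})$. The function names and the brute-force grid search are illustrative choices made here.

```python
def fbm_decay_constant(mu, hurst):
    """Closed-form value of inf_{d>0} d^(2H-2) (d + mu)^2 / 2."""
    return mu ** (2 * hurst) / (
        2.0 * hurst ** (2 * hurst) * (1.0 - hurst) ** (2 - 2 * hurst))

def fbm_decay_constant_grid(mu, hurst, d_max=50.0, steps=200_000):
    """Brute-force minimum of g(d) (d + mu)^2 / 2 with g(d) = d^(2H-2) on a grid."""
    best = float("inf")
    for i in range(1, steps + 1):
        d = d_max * i / steps
        best = min(best, d ** (2 * hurst - 2) * (d + mu) ** 2 / 2.0)
    return best

if __name__ == "__main__":
    for h in (0.3, 0.5, 0.7):
        print(h, fbm_decay_constant(1.0, h), fbm_decay_constant_grid(1.0, h))
```

For $H = 1/2$ the constant reduces to $2\mu$, which agrees with the short-range formula $-2\mu/\sigma^2$ for $\sigma^2 = 1$.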
Notes and references Some standard textbooks on large deviations are Bucklew
[207], Dembo & Zeitouni [290] and Shwartz & Weiss [799].

In addition to Glynn & Whitt [419], see also Nyrhinen [667] for Theorem 1.2.
Müller & Pflug [652] give an elementary proof of this result in terms of exponential
inequalities. Variants of the claims delay model of Example 1.6 can be found in
Klüppelberg & Mikosch [545], Gao & Yan [388] and Ganesh, Macci & Torrisi [387].
For Example 1.7, see Nyrhinen [667] and Asmussen [67]; the proof of (1.20) is due to
Tatyana Turova. A more general adaptive premium rule was considered in Müller &
Pflug [652]. Result (1.23) is from Duffield & O’Connell [331], where details on the
derivation can be found; see also Chang, Yao & Zajic [231]. Fractional Brownian
Motion will be discussed in more detail in the framework of Gaussian processes in
Section 7.

Further applications of large deviations ideas in risk theory occur e.g. in Djehiche
[324], Lehtonen & Nyrhinen [576, 577], Martin-Löf [630, 631] and Nyrhinen [667].


2 Heavy-tailed risk models with dependent input


In the previous section we saw the effect of dependence on the adjustment co- 
efficient in the case of light-tailed claims. We now turn to heavy-tailed claim 
size distributions. In view of the ‘one large claim’ heuristics from independent 
increments it seems reasonable to expect a certain insensitivity of the asymp- 
totic behavior w.r.t. dependence, as long as the dependence is not too strong. 
Various criteria (on dependence types of interclaim times, but also for possi- 
ble dependence between the arrival process and the claim sizes) for this to be 
true were given by Asmussen, Schmidli & Schmidt [101]. We give here one of 
them, Theorem 2.1 based upon a regenerative assumption, and apply it to the 
Markov-modulated model of Chapter VII. For further approaches, examples 
and counterexamples, see [101]. 

Assume that the claim surplus process $\{S_t\}_{t\ge 0}$ has a regenerative structure
in the sense that there exists a renewal process $\chi_0 = 0 \le \chi_1 < \chi_2 < \ldots$ such
that

$$\{S_{\chi_0+t} - S_{\chi_0}\}_{0\le t<\chi_1-\chi_0},\quad \{S_{\chi_1+t} - S_{\chi_1}\}_{0\le t<\chi_2-\chi_1},\quad \ldots$$

(viewed as random elements of the space of D-functions with finite lifelengths)
are independent and the distribution of $\{S_{\chi_k+t} - S_{\chi_k}\}_{0\le t<\chi_{k+1}-\chi_k}$ is the same
for all $k = 1, 2, \ldots$. The zero-delayed case corresponds to $\chi_0 = \chi_1 = 0$ and
we write then $P_0$, $E_0$, $\psi_0(u)$ etc. We let $F^*$ denote the $P_0$-distribution of $S_1^*$,
assume $\mu_{F^*} < 0$ and $E_0\chi < \infty$ where $\chi = \chi_2 - \chi_1$ is the generic cycle. See Fig.
XIII.1 where the filled circles symbolize a regeneration in the path.

FIGURE XIII.1


Note that no specific sample path structure of $\{S_t\}$ (like in Fig. XIII.1) is
assumed. We return to this point in Example 2.4 below.
Define

$$S_n^* = S_{\chi_{n+1}}, \quad M_n^* = \max_{k=0,\ldots,n} S_k^*, \quad M^* = \max_{n=0,1,\ldots} S_n^*, \quad M = \sup_{t\ge 0} S_t.$$

The idea is now to observe that in the zero-delayed case, $\{S_n^*\}_{n=0,1,\ldots}$
(corresponding to the filled circles on Fig. XIII.1 except for the first one) is a
random walk. Thus the assumption

$$\overline{F^*}(x) = P_0(S_1^* > x) \sim \overline{G}(x) \tag{2.1}$$

for some $G$ such that both $G \in \mathcal{S}$ and $G_0 \in \mathcal{S}$ makes X.(3.3) applicable so that

$$P_0(M^* > u) \sim \frac{1}{|\mu_{F^*}|}\, \overline{F^*_I}(u), \quad u \to \infty. \tag{2.2}$$

Imposing suitable conditions on the behavior of $\{S_t\}$ within a cycle will then
ensure that $M$ and $M^*$ are sufficiently close to be tail equivalent. The one we
focus on is

$$P_0(M_1^{(\chi)} > x) \sim P_0(S_1^* > x), \tag{2.3}$$

where

$$M_n^{(\chi)} = \sup_{0\le t<\chi_{n+1}-\chi_n} S_{\chi_n+t} - S_{\chi_n} = \sup_{0\le t<\chi_{n+1}-\chi_n} S_{\chi_n+t} - S_{n-1}^*.$$

Since clearly $M_1^{(\chi)} \ge S_1^*$, the assumption means that $M_1^{(\chi)}$ and $S_1^*$ are not too
far away. See Fig. XIII.2.

FIGURE XIII.2


Theorem 2.1 Assume that (2.1) and (2.3) hold. Then

$$\psi_0(u) = P_0(M > u) \sim \frac{1}{|\mu_{F^*}|}\, \overline{F^*_I}(u).$$


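Proof. Since $M \ge M^*$, it suffices by (2.2) to show

$$\liminf_{u\to\infty} \frac{P_0(M^* > u)}{P_0(M > u)} \ge 1. \tag{2.4}$$

Define

$$\vartheta^*(u) = \inf\{n = 1, 2, \ldots : S_n^* > u\}, \qquad
\beta(u) = \inf\{n = 1, 2, \ldots : S_n^* + M_{n+1}^{(\chi)} > u\}$$

(note that $\{M > u\} = \{\beta(u) < \infty\}$). Let $a > 0$ be fixed. We shall use the
estimate

$$P_0\bigl(M > u,\, M_{\beta(u)+1}^{(\chi)} \le a\bigr) = o\bigl(P_0(M > u)\bigr) \tag{2.5}$$

which follows since

$$\begin{aligned}
P_0\bigl(M > u,\, M_{\beta(u)+1}^{(\chi)} \le a\bigr)
&\le P_0\Bigl(\bigcup_{n=1}^{\infty} \{M_n^* \in (u - a, u)\}\Bigr) \\
&\le P_0\bigl(M^* \in (u - a, u)\bigr)/P_0(M^* = 0) = o\bigl(P_0(M^* > u)\bigr).
\end{aligned}$$

Given $\epsilon > 0$, choose $a$ such that $P_0(S_1^* > x) \ge (1 - \epsilon)P_0(M_1^{(\chi)} > x)$, $x \ge a$.
Then by Lemma 3.4,

$$\begin{aligned}
P_0(M^* > u) &\sim P_0\bigl(M^* > u,\, S_{\vartheta^*(u)}^* - S_{\vartheta^*(u)-1}^* > a\bigr) \\
&= \sum_{n=0}^{\infty} P_0\bigl(M_n^* \le u,\, S_{n+1}^* - S_n^* > a \vee (u - S_n^*)\bigr) \\
&\ge (1 - \epsilon)\sum_{n=0}^{\infty} P_0\bigl(M_n^* \le u,\, M_{n+1}^{(\chi)} > a \vee (u - S_n^*)\bigr) \\
&\ge (1 - \epsilon)\sum_{n=0}^{\infty} P_0\Bigl(\max_{0\le t<\chi_{n+1}} S_t \le u,\, M_{n+1}^{(\chi)} > a \vee (u - S_n^*)\Bigr) \\
&= (1 - \epsilon)P_0\bigl(M > u,\, M_{\beta(u)+1}^{(\chi)} > a\bigr) \sim (1 - \epsilon)P_0(M > u).
\end{aligned}$$

Letting first $u \to \infty$ and next $\epsilon \downarrow 0$ yields (2.4).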


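Under suitable conditions, Theorem 2.1 can be rewritten as

$$\psi_0(u) \sim \frac{\rho}{1 - \rho}\, \overline{B}_0(u) \tag{2.6}$$

where $B$ is the Palm distribution of claims and $\rho - 1 = \lim_{t\to\infty} S_t/t$. To this
end, assume the path structure

$$S_t = \sum_{i=1}^{N_t} U_i - t + Z_t \tag{2.7}$$

with $\{Z_t\}$ continuous, independent of $\bigl\{\sum_{i=1}^{N_t} U_i\bigr\}$ and satisfying $Z_t/t \overset{\text{a.s.}}{\to} 0$.
Then the Palm distribution of claims is

$$B(x) = \frac{1}{E_0 N_\chi}\, E_0 \sum_{i=1}^{N_\chi} I(U_i \le x). \tag{2.8}$$

Write $\beta = E_0 N_\chi / E_0 \chi$.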

Corollary 2.2 Assume that $\{S_t\}$ is regenerative and satisfies (2.7). Assume
further that

(i) both $B$ and $B_0$ are subexponential;

(ii) $E_0 z^{N_\chi} < \infty$ for some $z > 1$;

(iii) For some $\sigma$-field $\mathcal{F}$, $\chi$ and $N_\chi$ are $\mathcal{F}$-measurable and

$$P_0\Bigl(\sum_{i=1}^{N_\chi} U_i > x \,\Big|\, \mathcal{F}\Bigr) \sim N_\chi\, \overline{B}(x);$$

(iv) $P_0\bigl(\sup_{0\le t<\chi} Z_t > x\bigr) = o(\overline{B}(x))$.


Then (2.6) holds with $\rho = \beta\mu_B$.


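Proof. It is easily seen that the r.v.’s $\sum_{i=1}^{N_\chi} U_i$ and $\sum_{i=1}^{N_\chi} U_i - \chi$ both have tails
of order $E_0 N_\chi \cdot \overline{B}(x)$, cf. the proof of Lemma 2.6 below. The same is true for
$S_1^*$, since the tail of $Z_\chi$ is lighter than $\overline{B}(x)$ by (iv), and also for $M_1^{(\chi)}$ since

$$M_1^{(\chi)} \le \sum_{i=1}^{N_\chi} U_i + \sup_{0\le t<\chi} Z_t.$$

Thus Theorem 2.1 is in force, and the rest is just rewriting of constants: since

$$\rho = 1 + \lim_{t\to\infty} \frac{S_t}{t} = 1 + \frac{\mu_{F^*}}{E_0\chi}$$

(see Proposition A1.4), we get

$$\psi_0(u) \sim \frac{1}{|\mu_{F^*}|}\int_u^{\infty} \overline{F^*}(x)\, dx
= \frac{1}{(1 - \rho)E_0\chi}\int_u^{\infty} E_0 N_\chi\, \overline{B}(x)\, dx
= \frac{\rho}{1 - \rho}\, \overline{B}_0(u).$$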


Example 2.3 As a first quick application, consider the periodic model of VII.6
with arrival rate $\beta(t)$ at time $t$ (periodic with period 1) and claims with
distribution $B$ (independent of the time at which they arrive). Assume that $B \in \mathcal{S}$,
$B_0 \in \mathcal{S}$, i.e. (i) holds. The regenerative assumption is satisfied if we take
$\chi_0 = \chi_1 = 0$, $\chi_2 = 1$, $\chi_3 = 2, \ldots$, $Z_t \equiv 0$ (thus (iv) is trivial). The number $N_\chi$
of claims arriving in $[0, 1)$ is Poisson with rate $\beta = \int_0^1 \beta(s)\, ds$ so that (ii) holds,
and taking $\mathcal{F} = \sigma(N_\chi)$, (iii) is obvious. Thus we conclude that (2.6) holds.


Example 2.4 Assume that $S_t = W_t - t + \sum_{i=1}^{N_t} U_i$ where $\bigl\{\sum_{i=1}^{N_t} U_i - t\bigr\}$ is
standard compound Poisson and $\{W_t\}$ an independent Brownian motion with
mean zero and variance constant $\sigma^2$. Again, we assume that $B \in \mathcal{S}$, $B_0 \in \mathcal{S}$;
then (iv) holds since the distribution of $\sup_{0\le t\le 1} W(t)$ is the same as that of
$|W_1|$, in particular light-tailed. Taking again $\chi_0 = \chi_1 = 0$, $\chi_2 = 1$, $\chi_3 = 2, \ldots$,
we conclude just as in Example 2.3 that (2.6) holds. In particular, note that the
asymptotics of $\psi_0(u)$ is the same irrespective of whether the Brownian term $W_t$
in $S_t$ is present or not.




We now return to the Markov-modulated risk model of Chapter VII with
background Markov process $\{J_t\}$ with $p < \infty$ states and stationary distribution
$\pi$. The arrival rate is $\beta_i$ and the claim size distribution $B_i$ when $J_t = i$. We
consider the case where one or more of the claim size distributions $B_i$ are
heavy-tailed. More precisely, we will assume that

$$\lim_{x\to\infty} \frac{\overline{B}_i(x)}{\overline{G}(x)} = c_i \tag{2.9}$$

for some distribution $G$ such that both $G$ and the integrated tail $\int_x^{\infty} \overline{G}(y)\, dy$
are subexponential, and for some constants $c_i < \infty$ such that $c_1 + \cdots + c_p > 0$.
The average arrival rate $\beta$ and the Palm distribution $B$ of the claim sizes are
given by

$$\beta = \sum_{i=1}^{p} \pi_i\beta_i, \qquad B = \frac{1}{\beta}\sum_{i=1}^{p} \pi_i\beta_i B_i$$

and we assume $\rho = \beta\mu_B = \sum_{i=1}^{p} \pi_i\beta_i\mu_{B_i} < 1$.


Theorem 2.5 Consider the Markov-modulated risk model with claim size dis- 
tributions satisfying (2.9). Then (2.6) holds. 


The key step of the proof is the following lemma. 


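Lemma 2.6 Let $(N_1, \ldots, N_p)$ be a random vector in $\{0, 1, 2, \ldots\}^p$, $\chi \ge 0$ a
r.v. and $\mathcal{F}$ a $\sigma$-algebra such that $(N_1, \ldots, N_p)$ and $\chi$ are $\mathcal{F}$-measurable. Let
$\{F_i\}_{i=1,\ldots,p}$ be a family of distributions on $[0, \infty)$ and define

$$Y_\chi = \sum_{i=1}^{p}\sum_{j=1}^{N_i} X_{ij} - \chi$$

where conditionally upon $\mathcal{F}$ the $X_{ij}$ are independent with distribution $F_i$ for
$X_{ij}$. Assume $E z^{N_1 + \cdots + N_p} < \infty$ for some $z > 1$, and that for some
distribution $G$ on $[0, \infty)$ such that $G \in \mathcal{S}$ and some $c_1, \ldots, c_p$ with $c_1 + \cdots + c_p
> 0$ it holds that $\overline{F}_i(x) \sim c_i\overline{G}(x)$. Then

$$P(Y_\chi > x) \sim c\,\overline{G}(x) \quad\text{where}\quad c = \sum_{i=1}^{p} c_i E N_i.$$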


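Proof. Consider first the case $\chi = 0$. It follows by a slight extension of Section
X.1 that

$$P(Y_0 > x \mid \mathcal{F}) \sim \overline{G}(x)\sum_{i=1}^{p} c_i N_i, \qquad
P(Y_0 > x \mid \mathcal{F}) \le C\,\overline{G}(x)\, z^{N_1 + \cdots + N_p}$$

for some $C = C(z) < \infty$. Thus dominated convergence yields

$$\frac{P(Y_0 > x)}{\overline{G}(x)} = E\Bigl(\frac{P(Y_0 > x \mid \mathcal{F})}{\overline{G}(x)}\Bigr) \to E\Bigl(\sum_{i=1}^{p} c_i N_i\Bigr) = c.$$

In the general case, as $x \to \infty$,

$$P(Y_\chi > x \mid \mathcal{F}) = P(Y_0 > x + \chi \mid \mathcal{F}) \sim \overline{G}(x + \chi)\sum_{i=1}^{p} c_i N_i \sim \overline{G}(x)\sum_{i=1}^{p} c_i N_i,$$

and

$$P(Y_\chi > x \mid \mathcal{F}) \le P(Y_0 > x \mid \mathcal{F}) \le C\,\overline{G}(x)\, z^{N_1 + \cdots + N_p}.$$

The same dominated convergence argument completes the proof.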


Proof of Theorem 2.5. If $J_0 = i$, we can define the regeneration points as the
times of returns to $i$, and the rest of the argument is then just as the proof of
Corollary 2.2. An easy conditioning argument then yields the result when $J_0$ is
random.


For light-tailed distributions, Markov-modulation typically decreases the
adjustment coefficient $\gamma$ and thereby changes the order of magnitude of the ruin
probabilities for large $u$, cf. VII.4. It follows from Theorem 2.5 that the effect
of Markov-modulation is in some sense less dramatic for heavy-tailed
distributions: the order of magnitude of the ruin probabilities remains $\int_u^{\infty} \overline{B}(x)\, dx$.

Within the class of risk processes in a Markovian environment, Theorem 2.5
shows that basically only the tail dominant claim size distributions (those with
$c_i > 0$) matter for determining the order of magnitude of the ruin probabilities
in the heavy-tailed case. In contrast, for light-tailed distributions the value of
the adjustment coefficient $\gamma$ is given by a delicate interaction between all $B_i$.


Notes and references Theorem 2.5 was first proved by Asmussen, Fløe Henriksen
& Klüppelberg [76] by a lengthy argument which did not provide the constant in front
of $\overline{B}_0(u)$ in final form. An improvement was given in Asmussen & Højgaard [80], and
the final reduction by Jelenkovic & Lazar [504]. The present approach via Theorem
2.1 is from Asmussen, Schmidli & Schmidt [101]. That paper also contains further
criteria for regenerative input (in particular also a treatment of the delayed case which
we have omitted here), as well as a condition for (2.6) to hold in a situation where the
inter-claim times $(T_1, T_2, \ldots)$ form a general stationary sequence and the $U_i$ i.i.d. and
independent of $(T_1, T_2, \ldots)$; this is applied for example to risk processes with Poisson
cluster arrivals. See also Araman & Glynn [52]. For further studies of perturbations
like in Corollary 2.2 and Example 2.4, see Schlegel [766] and Zwart, Borst & Debicki
[923]. In the latter reference, situations are identified under which perturbations by
general Gaussian processes do change the asymptotic behavior of $\psi(u)$.
3 Linear models


Let us consider a discrete-time risk model, where $R_n = u + Z_1 + \cdots + Z_n$ denotes
the surplus of the portfolio at the end of year $n$ and $Z_n$, correspondingly, the gain
incurred in year $n$ (of course any other time unit may be considered). Assume
the auto-regressive moving average (ARMA) structure

$$Z_n = a_1 Z_{n-1} + \cdots + a_m Z_{n-m} + X_n + b_1 X_{n-1} + \cdots + b_k X_{n-k}, \tag{3.1}$$

where $X_1, X_2, \ldots$ are i.i.d. r.v.’s with $E[X_n] > 0$ and $a_1, \ldots, a_m, b_1, \ldots, b_k$ are
constants. In compact notation one can write

$$p(A)Z_n = q(A)X_n$$

with the polynomials $p(x) = 1 - a_1 x - \cdots - a_m x^m$ and $q(x) = 1 + b_1 x + \cdots + b_k x^k$
and $A$ the backward shift operator. Assume that $q(1) > 0$, $p(x)$ and $q(x)$ do
not have any common factor and all zeros of $p(x)$ lie outside the unit disk of the
complex plane (hence $p(1) > 0$).


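Proposition 3.1 Assume that $\{R_n\}$ follows an ARMA structure of the form
(3.1) with the above assumptions and with given initial values $z_0, \ldots, z_{-m+1}$;
$x_0, \ldots, x_{-k+1}$. Assume further that a positive solution $r = \gamma$ of the adjustment
equation $E[e^{-rX_1}] = 1$ exists. Then

$$P\bigl(\tau(u) < \infty \mid z_0, \ldots, z_{-m+1};\, x_0, \ldots, x_{-k+1}\bigr)
= \frac{\exp\bigl\{-\gamma\bigl(u + \sum_{\ell=0}^{\infty}\bigl(\sum_{j=\ell+1}^{\infty} \tilde b_j\bigr)x_{-\ell}\bigr)/b'\bigr\}}
{E\bigl[\exp\bigl\{-\gamma\bigl(R_{\tau(u)} + \sum_{\ell=0}^{\infty}\bigl(\sum_{j=\ell+1}^{\infty} \tilde b_j\bigr)X_{\tau(u)-\ell}\bigr)/b'\bigr\} \bigm| \tau(u) < \infty\bigr]} \tag{3.2}$$

where $\tilde b_j$ and $b'$ are given by (3.4) and (3.6), and $x_{-k}, \ldots, x_{-m-k+1}$ are
determined by (3.5).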


Proof. One can equivalently express the ARMA model through a moving average
(MA) model

$$Z_n = X_n + \sum_{\ell=1}^{\infty} \tilde b_\ell X_{n-\ell}, \tag{3.3}$$

where (for instance) $X_n = 0$ for $n \le -m-k$ and $\tilde b_\ell$ is determined by

$$q(x)/p(x) = 1 + \sum_{\ell=1}^{\infty} \tilde b_\ell x^\ell. \tag{3.4}$$

The needed $m$ additional starting values $x_{-k}, \ldots, x_{-m-k+1}$ can then be
determined in such a way that

$$z_n = x_n + \sum_{\ell=1}^{m+k+n-1} \tilde b_\ell x_{n-\ell} \quad\text{for } n = 0, \ldots, -m+1 \tag{3.5}$$

(which is a linear system of equations with a unique solution). From the
location of the zeros of $p(x)$, it follows that the $\tilde b_\ell$ tend to zero exponentially fast
and consequently $\sum_{\ell=1}^{\infty} |\tilde b_\ell| < \infty$. Define

$$b' = 1 + \sum_{\ell=1}^{\infty} \tilde b_\ell = q(1)/p(1). \tag{3.6}$$

It is not difficult to check that $\exp\bigl\{-\gamma\bigl(R_n + \sum_{\ell=0}^{\infty}\bigl(\sum_{j=\ell+1}^{\infty} \tilde b_j\bigr)X_{n-\ell}\bigr)/b'\bigr\}$ is
a martingale and the assertion then follows as usual by optional stopping for
$\tau(u) \wedge T$ and $T \to \infty$ (for bounded r.v. $X_i$, this limit operation can be justified
by dominated convergence; for the unbounded case more work is needed, see
Promislow [716]).
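
The passage from the ARMA form (3.1) to the MA form (3.3) amounts to a power series division, cf. (3.4), which is straightforward to carry out numerically. In the sketch below the function names and the ARMA(1,1) test values are illustrative assumptions; the recursion follows from comparing coefficients in $p(x)\sum_\ell c_\ell x^\ell = q(x)$, where $c_0 = 1$ and $c_\ell$ for $\ell \ge 1$ is the $\ell$th MA coefficient.

```python
def ma_coefficients(a, b, n_terms):
    """First n_terms power-series coefficients c_0 = 1, c_1, ... of q(x)/p(x), cf. (3.4).

    a = [a_1, ..., a_m] and b = [b_1, ..., b_k] are the ARMA coefficients in (3.1),
    so p(x) = 1 - a_1 x - ... - a_m x^m and q(x) = 1 + b_1 x + ... + b_k x^k.
    Comparing coefficients gives c_l = q_l + sum_{j=1}^{min(l, m)} a_j c_{l-j}.
    """
    q = [1.0] + list(b)
    c = []
    for l in range(n_terms):
        q_l = q[l] if l < len(q) else 0.0
        c.append(q_l + sum(a[j - 1] * c[l - j]
                           for j in range(1, min(l, len(a)) + 1)))
    return c

def total_weight(a, b):
    """b' = q(1) / p(1), the overall contribution of one innovation, cf. (3.6)."""
    return (1.0 + sum(b)) / (1.0 - sum(a))

# Hypothetical ARMA(1,1) example: Z_n = 0.5 Z_{n-1} + X_n + 0.3 X_{n-1}.
coeffs = ma_coefficients([0.5], [0.3], 200)
print(coeffs[:4], sum(coeffs), total_weight([0.5], [0.3]))
```

Because the zeros of $p$ lie outside the unit disk, the coefficients decay geometrically, and the truncated sum agrees with $q(1)/p(1)$ to machine precision after a moderate number of terms.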


Remark 3.2 The appearance of the factor $b'$ in the above result is natural
since in view of (3.1) the overall contribution of $X_n$ to the surplus over time
is $b'X_n$. Hence a ‘fair’ comparison of the ARMA model (3.1) (or equivalently
(3.3)) with a classical risk model with independent increments would be to
consider for the latter $\tilde R_n = \tilde u + \tilde Z_1 + \cdots + \tilde Z_n$ where $\tilde Z_n = b'X_n$ and
$\tilde u = u + \sum_{\ell=0}^{\infty}\bigl(\sum_{j=\ell+1}^{\infty} \tilde b_j\bigr)x_{-\ell}$ is the sum of $u$ and the weighted contributions
of all the deterministic starting values. The adjustment coefficient in this
independence model is then $\tilde\gamma = \gamma/b'$. Hence the adjustment coefficient of the ARMA
model and the one of its independence counterpart are equivalent, i.e. the
introduced dependence is weak enough to leave the adjustment coefficient unchanged.


Notes and references Proposition 3.1 is (for bounded r.v.’s) given in Gerber 
[400, 401] who also showed that for a finite-order MA model the denominator in (3.2) 
converges to a constant for u — oo, which establishes a Cramér-Lundberg approxi- 
mation. Promislow [716] extended the proof to unbounded r.v.’s and slightly weaker 
conditions on the coefficients of the resulting MA model. Chan & Yang [229] include 
a force of interest and consider separate time series for the premium income and the 
annual claim payments. Particular cases of the ARMA model have immediate interpre- 
tations for a credibility model (see [401]) as well as for models including underwriting 
cycles effects on premiums and certain IBNR models for delay of claim payments (see 
e.g. Trufin, Albrecher & Denuit [853, 855]). Linear processes of the above type can 
also be addressed by large deviation techniques, which leads to logarithmic
asymptotics only, but asymptotic information about the time of ruin can then be achieved as
well, see e.g. Nyrhinen [666]. Other types of short-range dependence structures are e.g. 
discussed in Albrecher & Kantor [32] and Afonso, Egidio dos Reis & Waters [7], where 
the size of an annually changing premium may depend on previous loss experience. 
An extension of Cramér-type estimates to certain non-Gaussian long-range dependent 
processes of fractional auto-regressive integrated moving average (FARIMA) type is 
given in Barbe & McCormick [133]. Two-sided infinite-order MA processes with reg- 
ularly varying tails were investigated by Mikosch & Samorodnitsky [640] and it was 
shown that under a tail-balance condition and some conditions on the coefficients (that 
imply short-range dependence!) the asymptotic ruin probability has the same asymp- 
totic order as the case with independent increments given in Theorem X.3.1 (namely, 
the tail of the stationary excess distribution of the increments), but the constant in 
front changes (a similar conclusion is found for a first-order AR process with random 
coefficients in Konstantinides & Mikosch [550]; see also Hult & Samorodnitsky [486] 
for a recent extension to general two-sided linear processes). Barbe & McCormick 
[134] show that for non-stationary and long-range dependent FARIMA processes with 
regularly varying innovations this insensitivity no longer holds and the asymptotic or- 
der changes. Mikosch & Samorodnitsky [641] study the ruin probability of stationary 
ergodic symmetric $\alpha$-stable processes for $\alpha \in (1,2)$ and show that its asymptotic decay
can become significantly slower than the one for independent increments; further
refinements of these results are given in Alparslan & Samorodnitsky [44]. 


4 Risk processes with shot-noise Cox intensities 


Consider the surplus process $R_t = u + t - \sum_{i=1}^{N_t} U_i$ with i.i.d. claim sizes $U_i$
that are independent of $N_t$, but now the claim number process $N_t$ is a doubly
stochastic Poisson process (Cox process) with a Poisson shot noise intensity
process of the form

$$\beta_t = \beta + \sum_{n \in \mathbb{N}} h(t - \sigma_n, Y_n), \tag{4.1}$$

where $\{\sigma_n\}_{n \in \mathbb{N}}$ is the sequence of arrival epochs of a homogeneous Poisson
process of rate $\zeta$, $\{Y_n\}_{n \in \mathbb{N}}$ is an i.i.d. sequence of positive r.v. (with distribution
function $F_Y$) independent of the Poisson process, and the function $h(t, x)$ is
non-negative with $h(t, x) = 0$ for $t < 0$ (here $\beta > 0$ is assumed to be constant).
An interpretation of model (4.1) is as follows: In addition to the occurrence
of ‘normal’ claims described by a homogeneous Poisson process with constant
rate $\beta$, there are also claims triggered by external events (such as natural
catastrophes). These events occur at times $\{\sigma_n\}_{n \in \mathbb{N}}$ (according to a homogeneous
Poisson process with rate $\zeta$). Due to reporting lags of the claims that originate
from a given external event, the resulting increase in intensity will develop
according to the function $h(t - \sigma_n, Y_n)$. This model captures the effect that such
events can lead to a dramatic increase of the number of claims, whereas the
individual claim sizes still follow the same distribution $B$. Figure XIII.3 shows
a sample path of the intensity $\beta_t$ for $h(t, x) = x e^{-t}$ $(t \ge 0)$ (with $Y_n$ being
exponential(1), $\beta = 0.5$ and $\zeta = 0.7$).
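
A sample path of this kind can be sketched in a few lines of Python. The sketch below assumes the exponential-decay kernel $h(t, x) = x e^{-t}$ with exponential(1) marks and uses the parameter values quoted for the figure; the function names `simulate_events` and `intensity`, the time horizon and the seed are illustrative choices, not taken from the text.

```python
import math
import random

def simulate_events(horizon, zeta, rng):
    """Arrival epochs sigma_n of a rate-zeta Poisson process on [0, horizon],
    each paired with an independent exponential(1) mark Y_n."""
    events = []
    t = rng.expovariate(zeta)
    while t <= horizon:
        events.append((t, rng.expovariate(1.0)))
        t += rng.expovariate(zeta)
    return events

def intensity(t, events, beta):
    """beta_t = beta + sum_n h(t - sigma_n, Y_n) with h(t, x) = x * exp(-t) for t >= 0."""
    return beta + sum(y * math.exp(-(t - s)) for s, y in events if s <= t)

rng = random.Random(1)                      # fixed seed, for reproducibility only
events = simulate_events(10.0, 0.7, rng)    # external events on [0, 10]
path = [(0.1 * k, intensity(0.1 * k, events, 0.5)) for k in range(101)]
```

Each pair in `path` is a time point and the corresponding intensity value; between external events the intensity decays back towards the base level 0.5.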


FIGURE XIII.3
Let $H(t, y) = \int_0^t h(s, y)\,ds$ and

$$\Lambda_t = \int_0^t \beta_s\,ds = \beta t + \sum_{n \in \mathbb{N}} H(t - \sigma_n, Y_n).$$

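The limiting average claim amount arriving per unit time turns out to be
$\mu = (\beta + \zeta\,E H(\infty, Y_1))\mu_B$ and the safety loading condition here is $\mu < 1$. Similarly
to (1.14), one can derive by a differential equation in $t$ that

$$\log E e^{\theta A_t} = \beta t(\hat B(\theta) - 1) + \zeta \int_0^t \Bigl( E_Y\Bigl[\exp\Bigl\{(\hat B(\theta) - 1)\int_s^t h(w - s, Y)\,dw\Bigr\}\Bigr] - 1 \Bigr)\,ds$$

$$= \beta t(\hat B(\theta) - 1) + \zeta \int_0^t \bigl( E_Y\bigl[e^{(\hat B(\theta) - 1)H(s, Y)}\bigr] - 1 \bigr)\,ds, \tag{4.2}$$

where $A_t = \sum_{i=1}^{N_t} U_i$ again denotes the aggregate claim size at time $t$. For
$S_t = A_t - t$ and $\kappa_t(\theta) = \log E e^{\theta S_t}$ we then have $\kappa_t(\theta)/t \to \kappa(\theta)$ with

$$\kappa(\theta) = \beta(\hat B(\theta) - 1) - \theta + \zeta \bigl( E_Y\bigl[e^{(\hat B(\theta) - 1)H(\infty, Y)}\bigr] - 1 \bigr). \tag{4.3}$$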


Theorem 4.1 Let both the m.g.f. $\hat B(\theta)$ and $E\exp(\theta H(\infty, Y))$ exist for all $\theta$ in
a neighborhood of the origin and be steep (cf. p.91). Then, the risk process with
claim occurrence according to the shot-noise intensity process (4.1) satisfies

$$\lim_{u \to \infty} \frac{1}{u}\log\psi(u) = -\gamma,$$

where $\gamma$ is the positive solution of $\kappa(\gamma) = 0$ and $\kappa$ is given by (4.3).
Proof. Considering a discrete skeleton $\{A_{nh}\}_{n \in \mathbb{N}}$, (4.3) implies that $\kappa_{nh}(\theta)/n$
has a limit of the form

$$\kappa^{(h)}(\theta) = h\kappa(\theta).$$

Since an easy calculation shows that $(\kappa^{(h)})''(\theta) > 0$ for every $\theta > 0$, $\kappa^{(h)}(0) = 0$
and $(\kappa^{(h)})'(0) = h(\mu - 1) < 0$ by the net profit condition, it follows that
$(\kappa^{(h)})'(\gamma) > 0$. Here the required steepness implies that $\kappa(\theta)$ is unbounded in a
neighborhood of its abscissa of convergence and hence guarantees the existence of
the solution $\gamma > 0$. Consequently, Theorem 1.2 applies and
$P(\max_n S_{nh} > u) \overset{\log}{\sim} e^{-\gamma u}$.
Finally, since

$$\max_{t \ge 0} S_t \ge \max_{n \in \mathbb{N}} S_{nh} \ge \max_{t \ge 0} S_t - h,$$

the maximum over $nh$ can be replaced by the continuous time maximum over $t$
and the theorem follows from $\psi(u) = P(\max_t S_t > u)$.
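
To make the role of (4.3) concrete, the following Python sketch solves $\kappa(\gamma) = 0$ numerically for the parameters of the Figure XIII.3 example ($\beta = 0.5$, $\zeta = 0.7$, $h(t, x) = x e^{-t}$ and $Y$ exponential(1), so that $H(\infty, y) = y$). The claim size law is not specified for that example; exponential claims with mean $0.5$ are assumed here purely for illustration, which gives $\mu = 0.6 < 1$. All names in the code are hypothetical.

```python
BETA, ZETA = 0.5, 0.7     # parameters quoted for the Figure XIII.3 example
CLAIM_RATE = 2.0          # assumption: U_i ~ exponential(2), i.e. mu_B = 0.5

def b_hat(theta):
    """m.g.f. of an exponential(CLAIM_RATE) claim, finite for theta < CLAIM_RATE."""
    return CLAIM_RATE / (CLAIM_RATE - theta)

def mgf_h_inf(c):
    """E exp(c * H(inf, Y)); here H(inf, y) = y and Y ~ exponential(1), so c < 1."""
    return 1.0 / (1.0 - c)

def kappa(theta):
    """kappa(theta) as in (4.3), valid here for theta < 1."""
    c = b_hat(theta) - 1.0
    return BETA * c - theta + ZETA * (mgf_h_inf(c) - 1.0)

def bisect(f, lo, hi, tol=1e-12):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

MU = (BETA + ZETA * 1.0) / CLAIM_RATE   # mu = (beta + zeta * E H(inf, Y)) * mu_B
gamma = bisect(kappa, 0.1, 0.9)         # positive root of kappa
```

Under these assumptions $\kappa(\theta)/\theta = 0.5/(2 - \theta) - 1 + 0.35/(1 - \theta)$, so $\gamma$ is the smaller root of $\theta^2 - 2.15\,\theta + 0.8 = 0$, which gives a closed form to check the numerical result against.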


We now intend to refine Theorem 4.1. For that purpose, consider the compound
Poisson batch process $\tilde R_t$, which is obtained by moving all arrivals of
claims that are caused by a catastrophic event at $\sigma_n$ to $\sigma_n$. This risk process
has intensity $\tilde\beta = \beta + \zeta$ for arrivals of claims and a claim size distribution $\tilde B$
which is a mixture of $B$ and the distribution of the random sum $Z = \sum_{i=1}^{N(Y)} U_i$,
where $N(Y)$ is Poisson with parameter $H(\infty, Y)$ given $Y$ and independent of
the $U_i$; the weights are $\beta/\tilde\beta$, resp. $\zeta/\tilde\beta$, and the premium rate is 1 ($Z$ can be
interpreted as the total claim amount caused by the $N(Y)$ claims triggered by
a specific event). Let $L$ be the time from the event until the last of the $N(Y)$
claims occurs and $\tilde\psi(u)$ the ruin probability of this compound Poisson batch
process. Obviously, $\psi(u) \le \tilde\psi(u)$.


Theorem 4.2 For some constant $C_- > 0$, $C_- e^{-\gamma u} \le \psi(u) \le e^{-\gamma u}$ for all $u$.


Proof. The upper inequality is clear from Lundberg’s inequality for $\tilde\psi(u)$. For
the lower, it is well known that $\tilde R_{\tilde\tau(u)-}$ has a limit distribution given $\tilde\tau(u) < \infty$
as $u \to \infty$ (see Proposition V.7.4). Hence there exists an $A$ such that

$$P(\tilde\tau(u) < \infty,\ \tilde R_{\tilde\tau(u)-} \le A) \ge (1 - \epsilon)\tilde\psi(u) \tag{4.4}$$

for all large $u$. Define the pre-$\tilde\tau(u)$ occupation measure $Q^{(u)}$ by

$$Q^{(u)}(G) = E\int_0^{\tilde\tau(u)} I(\tilde S_t \in G)\,dt, \quad G \subseteq (-\infty, u).$$

Then the l.h.s. of (4.4) is

$$\int_{u-A}^{u} \tilde\beta\,\bigl(1 - \tilde B(u - x)\bigr)\,Q^{(u)}(dx)$$

which is bounded above by $\tilde\beta Q^{(u)}(u - A, u)$. Clearly, we can choose $\ell_1$ with
$P(Z > A, L \le \ell_1) > 0$. Every ruin event for $\tilde R_t$ will also cause ruin for $R_t$, if
the initial surplus $u$ is lowered by $\ell_1$, given that the variable $L$ corresponding
to the batch claim causing ruin does not exceed $\ell_1$. Moreover, considering the
situation only where the surplus prior to ruin is bounded above by $A$, we obtain
a lower bound for the ruin probability of $R_t$, getting

$$\psi(u - \ell_1) \ge \int_{u-A}^{u} \tilde\beta\,P(Z > u - x,\ L \le \ell_1)\,Q^{(u)}(dx)$$
$$\ge \tilde\beta\,Q^{(u)}(u - A, u)\,P(Z > A,\ L \le \ell_1)$$
$$\ge P(Z > A,\ L \le \ell_1)\,(1 - \epsilon)\,\tilde\psi(u).$$

Appealing to the Cramér-Lundberg asymptotics for $\tilde\psi(u)$, the proof is complete.


Let us now turn to heavy-tailed claim size distributions B. 


Theorem 4.3 Assume both $B \in \mathcal{S}$ and $B_0 \in \mathcal{S}$, and $E e^{\theta H(\infty, Y)} < \infty$ for some
$\theta > 0$. Then

$$\psi(u) \sim \frac{\mu}{1 - \mu}\,\bar B_0(u). \tag{4.5}$$

In the proof, we shall employ coupling with the batch process $\tilde R_t$ defined
above. Clearly $\tilde S_t \ge S_t$ in the sense of sample paths, and so it is trivial that
$\psi(u) \le \tilde\psi(u)$. The next lemma shows that $\tilde\psi(u)$ has the claimed asymptotics,
establishing the asymptotic upper bound in (4.5).


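Lemma 4.4 Under the assumptions of Theorem 4.3, $\tilde\psi(u) \sim \frac{\mu}{1 - \mu}\,\bar B_0(u)$.


Proof. Conditioning upon $Y$, we get

$$E z^{N(Y)} = E\exp\{H(\infty, Y)(z - 1)\}$$

which, under the assumptions of Theorem 4.3, is finite for some $z > 1$ (implying
that $P(N(Y) = n)$ decreases geometrically fast in $n$). Hence Lemma X.2.2
implies

$$P(Z > x) \sim E H(\infty, Y)\,\bar B(x) \tag{4.6}$$

and subsequently

$$1 - \tilde B(x) \sim \frac{\beta + \zeta\,E H(\infty, Y)}{\tilde\beta}\,\bar B(x),$$

$$1 - \tilde B_0(x) \sim \frac{\beta + \zeta\,E H(\infty, Y)}{\tilde\beta\mu_{\tilde B}} \int_x^\infty \bar B(z)\,dz = \frac{\mu}{\tilde\beta\mu_{\tilde B}}\,\bar B_0(x) = \bar B_0(x),$$

where the last step uses $\tilde\beta\mu_{\tilde B} = (\beta + \zeta\,E H(\infty, Y))\mu_B = \mu$.
Finally, we have by Theorem X.2.1

$$\tilde\psi(u) \sim \frac{\tilde\beta\mu_{\tilde B}}{1 - \tilde\beta\mu_{\tilde B}}\,\bigl(1 - \tilde B_0(u)\bigr) \sim \frac{\mu}{1 - \mu}\,\bar B_0(u).$$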


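Proof of Theorem 4.3. Consider the aggregate claim process $\underline{A}_t$ obtained from
$A_t$ by moving all claims triggered by a catastrophic event and occurring at most
$\ell_0$ time units later to all occurring precisely $\ell_0$ time units after the catastrophic
event, whereas claims occurring more than $\ell_0$ time units later are deleted. Then
$\psi(u) \ge \underline{\psi}(u)$ for all $u$. Standard results on translation of Poisson processes imply
that the restriction of $\underline{A}_t - t$ to $t \in [\ell_0, \infty)$ is an ordinary Cramér-Lundberg risk
process, and by reasoning as in the proof of Lemma 4.4, we obtain

$$P\Bigl(\sup_{t \in [\ell_0, \infty)} \bigl(\underline{A}_t - \underline{A}_{\ell_0} - (t - \ell_0)\bigr) > u\Bigr) \sim \frac{\mu(\ell_0)}{1 - \mu(\ell_0)}\,\bar B_0(u), \tag{4.7}$$

where $\mu(\ell_0) = \mu_B(\beta + \zeta\,E H(\ell_0, Y))$. Now

$$\sup_{t \in [0, \infty)} (\underline{A}_t - t) \ge (\underline{A}_{\ell_0} - \ell_0) + \sup_{t \in [\ell_0, \infty)} \bigl(\underline{A}_t - \underline{A}_{\ell_0} - (t - \ell_0)\bigr). \tag{4.8}$$

Here the two terms are independent. Since $\underline{A}_{\ell_0}$ is the sum of a Poisson($\beta\ell_0$)
number of claims, $P(\underline{A}_{\ell_0} - \ell_0 > u) \sim \beta\ell_0 \bar B(u)$, which is dominated by (4.7).
Hence the tail of $\sup_{t \in [0, \infty)}(\underline{A}_t - t)$ is asymptotically given by (4.7), and we get

$$\liminf_{u \to \infty} \frac{\psi(u)}{\bar B_0(u)} \ge \liminf_{u \to \infty} \frac{\underline{\psi}(u)}{\bar B_0(u)} = \frac{\mu(\ell_0)}{1 - \mu(\ell_0)}.$$

Letting $\ell_0 \to \infty$ and using $\mu(\ell_0) \uparrow \mu$, we obtain

$$\liminf_{u \to \infty} \frac{\psi(u)}{\bar B_0(u)} \ge \frac{\mu}{1 - \mu}.$$

Combining this with the bound $\psi(u) \le \tilde\psi(u)$ and Lemma 4.4 completes the
proof.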


Remark 4.5 Note that for both light- and heavy-tailed claims, the asymp- 
totic behavior of the ruin probability is the same as for the compound Poisson 
batch process, which is the process where all claims triggered from a particular 
event occur directly at that time as one ‘batch claim’. In other words, on an 
asymptotic scale, the ruin probability turns out to be insensitive to the intro- 
duced dependence of claim arrivals (delay of claim arrivals, respectively) in this 
model. 
Notes and references The risk model with a Poisson shot-noise intensity was
first proposed in Dassios & Jang [274] for the specific form h(t, x) = xe™* which makes 
Rt a piecewise deterministic Markov process and then in principle enables an analysis 
with tools developed in Embrechts, Grandell & Schmidli [345]. Palmowski [678] uses a 
generator approach to derive an upper bound for the ruin probability for a general class 
of Cox processes generated by a diffusion process. For the estimation of the intensity 
from claim data, see Dassios & Jang [275]. The results given above are from Albrecher 
& Asmussen [12], where some further results on the corresponding aggregate claim 
sizes, finite horizon ruin probabilities and the inclusion of adaptive premium rules 
can be found. It is also possible to add a further stochastic process in (4.1) that 
represents some transient behavior. The particular choice v = ence h(t — on, Yn) 
then makes the resulting process stationary in time, in which case R; is a Poisson 
cluster process and Theorem 4.3 is covered by Theorem 3.1 of Asmussen, Schmidli 
& Schmidt [101]. Albrecher & Macci [34] provide sample path large deviations for 
the ruin probability of such a model in a Bayesian framework, where there is some 
uncertainty about involved parameters (see also Macci & Petrella [619]). Another 
approach to model ruin probabilities in the presence of catastrophes can be found in 
Cossette, Duchesne & Marceau [258]. 


5 Causal dependency models 


Most of the models discussed so far contain dependence between claim sizes 
and/or their occurrence times through some common environment conditions. 
However, sometimes a causal dependence model may be needed in practice, 
where for instance the size of a claim determines the distribution of the next 
interclaim times (think e.g. of insurance of earthquake damages, where a large 
claim coming from an earthquake event may be followed by one of an afterquake 
etc.). It turns out that an example of a dependency model of that kind, where 
each interclaim time depends on the size of the previous claim, can conveniently 
be embedded in a semi-Markovian framework and in that way even allows explicit
formulas for the ruin probability and related quantities. To see this, consider
the surplus process $R_t = u + t - \sum_{i=1}^{N_t} U_i$ with i.i.d. claims $U_i$ (and generic
claim size distribution $U$), and assume that the time $T_{i+1}$ between the $i$th claim
$U_i$ and the $(i+1)$th claim $U_{i+1}$ is exponentially distributed with parameter $\beta_j$
if $U_i \in F_j$, where $(F_j)_{j=1,\dots,M}$ is a (possibly random) partition of the positive
halfline.


Let on the other hand $\{Z_n\}_{n \ge 0}$ be an irreducible discrete-time Markov chain
with state space $\{1, \dots, M\}$ and transition matrix $P = (p_{ij})_{1 \le i,j \le M}$ and
consider the semi-Markovian model

$$P(T_{n+1} \le x,\ U_{n+1} \le y,\ Z_{n+1} = j \mid Z_n = i,\ (T_r, U_r, Z_r),\ 0 \le r \le n) = (1 - e^{-\beta_i x})\,p_{ij}\,B_j(y).$$

Then the choices $p_{ij} = P(U \in F_j)$ and $B_j \sim U \mid U \in F_j$ exactly correspond to
the above causal dependency model. The net profit condition in this model is
$\sum_{i=1}^M \pi_i\mu_i < \sum_{i=1}^M \pi_i/\beta_i$, where $\pi = (\pi_1, \dots, \pi_M)$ is the stationary distribution
of $\{Z_n\}$ and $\mu_i$ is the mean of the distribution $B_i$. Let $m_i(u)$ denote the Gerber-
Shiu function (cf. XII.(1.1)) given that $Z_0 = i$. By the usual conditioning


technique on the time interval (0,dt), or (more formally) using the generator 
approach, one obtains the system of IDEs (i = 1,..., M) 


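$$m_i'(u) - (\beta_i + \delta)m_i(u) + \beta_i \sum_{j=1}^M p_{ij} \int_0^u m_j(u - y)\,B_j(dy) + \beta_i \sum_{j=1}^M p_{ij} \int_u^\infty w(u, y - u)\,B_j(dy) = 0,$$

and via Laplace transforms we arrive at the matrix equation

$$\bigl((s - \delta)I - \Lambda + \Lambda P \hat B[-s]\bigr)\,\hat m[-s] = m(0) - \Lambda P\,\hat\omega[-s], \tag{5.1}$$

where $m(u) = (m_1(u), \dots, m_M(u))$, $\hat m[-s] = (\hat m_1[-s], \dots, \hat m_M[-s])$, $\hat\omega[-s] =
(\hat\omega_1[-s], \dots, \hat\omega_M[-s])$ with $\hat\omega_i[-s] = \int_0^\infty e^{-sx} \int_x^\infty w(x, y - x)\,B_i(dy)\,dx$ and $\Lambda =
\mathrm{diag}(\beta_1, \dots, \beta_M)$, $\hat B[-s] = \mathrm{diag}(\hat B_1[-s], \dots, \hat B_M[-s])$. As usual, we assume the
boundary condition $\lim_{u \to \infty} m_i(u) = 0$ $(i = 1, \dots, M)$.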

First, the quantities $m_i(0)$ have to be determined. For that purpose, denote
$A_\delta(s) = (s - \delta)I - \Lambda + \Lambda P \hat B[-s]$. The equation

$$\det(A_\delta(s)) = 0 \tag{5.2}$$

now generalizes the Lundberg fundamental equation XII.(2.2). By a combination
of complex analysis and linear algebra, one can show that (5.2) has $M$ zeros
$\rho_1, \dots, \rho_M$ with $\Re(\rho_i) > 0$ for $\delta > 0$ and $\det(A_0(s)) = 0$ has one zero $\rho_1 = 0$
and $M - 1$ zeros $\rho_2, \dots, \rho_M$ with $\Re(\rho_i) > 0$ (see [5, 19] for details).

The $m_i(u)$ are bounded functions due to the boundary conditions, so $\hat m_i[-s]$
are analytic functions for $\Re(s) > 0$ (for $s = 0$ we further need integrability
of $m_i(u)$), and for each of the $M$ zeros $\rho_1, \dots, \rho_M$ we can now proceed in the
following way: determine a non-trivial solution $k_i$ of

$$A_\delta^T(\rho_i)\,k_i = 0$$

for each $i = 1, \dots, M$. Since we then have

$$0 = \hat m[-\rho_i]^T A_\delta^T(\rho_i)\,k_i = \bigl(m(0) - \Lambda P\,\hat\omega[-\rho_i]\bigr)^T k_i,$$

this gives $M$ linear equations for $m_1(0), \dots, m_M(0)$.
Remark 5.1 For $\delta = 0$, the zeros $\rho_1, \dots, \rho_M$ can always be obtained numerically.
Moreover, if the involved claim size distributions have a rational Laplace
transform, then $m(u)$ can be obtained explicitly by inversion of the Laplace
transform of the solution of (5.1).


Example 5.2 To see how this can be put into practice, consider a causal
dependency model, where the $(n+1)$th interclaim time $T_{n+1}$ is exponential($\beta_1$) if
$U_n > \Theta_n$ for some random threshold $\Theta_n$, and $T_{n+1}$ is exponential($\beta_2$) if $U_n \le \Theta_n$.
This corresponds to $M = 2$ and

$$dB_1(y) = \frac{P(\Theta \le y)\,dB(y)}{P(\Theta \le U)} \quad \text{and} \quad dB_2(y) = \frac{P(\Theta > y)\,dB(y)}{P(\Theta > U)}$$

and $p_{i1} = P(U > \Theta)$ and $p_{i2} = P(U \le \Theta)$ for $i = 1, 2$. Let $\Theta$ be exponential(2)
and $B$ exponential(1), $\beta_1 = 1.5$, $\beta_2 = 0.5$. Then

$$P = \begin{pmatrix} 2/3 & 1/3 \\ 2/3 & 1/3 \end{pmatrix}, \quad \Lambda = \begin{pmatrix} 1.5 & 0 \\ 0 & 0.5 \end{pmatrix}, \quad \hat B_1[-s] = \frac{3}{(1+s)(3+s)}, \quad \hat B_2[-s] = \frac{3}{3+s}.$$

For $\delta = 0$, we obtain the determinant

$$\det A_0(s) = \frac{3}{4} - 2s + s^2 + \frac{6s - 3}{2(3 + 4s + s^2)} + \frac{2s - 3}{4(3 + s)},$$

which has one zero $\rho_1 = 0$ and one positive zero $\rho_2 = 1.226$; the two remaining
zeros $s = -0.065$ and $s = -3.161$ are negative. E.g., for the ruin probabilities
one obtains

$$\psi_1(u) = 0.007\,e^{-3.161u} + 0.938\,e^{-0.065u}, \qquad \psi_2(u) = 0.003\,e^{-3.161u} + 0.867\,e^{-0.065u}.$$
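
The zeros quoted in Example 5.2 can be reproduced numerically. The sketch below rebuilds $A_0(s) = sI - \Lambda + \Lambda P \hat B[-s]$ for the stated parameters and locates the non-trivial zeros of its determinant by bisection. The conditional means $4/3$ and $1/3$ and the stationary law $(2/3, 1/3)$ are computed from the stated distributions rather than quoted from the text, and the bracketing intervals and function names are illustrative choices.

```python
# Example 5.2 with delta = 0: Theta ~ exponential(2), B ~ exponential(1).
BETA = (1.5, 0.5)                       # diagonal of Lambda
P = ((2 / 3, 1 / 3), (2 / 3, 1 / 3))    # p_i1 = P(U > Theta) = 2/3, p_i2 = 1/3
PI = (2 / 3, 1 / 3)                     # stationary law of Z_n (the rows of P coincide)
MEANS = (4 / 3, 1 / 3)                  # means of B_1 and B_2 (computed, not quoted)

def b_hat(s):
    """Laplace transforms of B_1 and B_2 evaluated at s."""
    return (3.0 / ((1.0 + s) * (3.0 + s)), 3.0 / (3.0 + s))

def det_a0(s):
    """Determinant of the 2x2 matrix s*I - Lambda + Lambda * P * diag(b_hat(s))."""
    b = b_hat(s)
    a = [[(s - BETA[i] if i == j else 0.0) + BETA[i] * P[i][j] * b[j]
          for j in range(2)] for i in range(2)]
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def bisect(f, lo, hi, tol=1e-12):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Net profit condition: sum_i pi_i mu_i < sum_i pi_i / beta_i.
net_profit_lhs = sum(p * m for p, m in zip(PI, MEANS))
net_profit_rhs = sum(p / b for p, b in zip(PI, BETA))

rho_2 = bisect(det_a0, 0.5, 2.0)                      # positive zero
negative_zeros = (bisect(det_a0, -4.0, -3.05),        # brackets avoid the poles
                  bisect(det_a0, -0.5, -0.01))        # at s = -3 and s = -1
```

The negatives of the two negative zeros are the exponential decay rates that appear in $\psi_1(u)$ and $\psi_2(u)$; the smaller one, about $0.065$, governs the asymptotic decay.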


Notes and references The explicit treatment of causal dependency models of the 
above kind can be found in Albrecher & Boxma [18, 19]; see also Adan & Kulkarni [5]
for related dependency models in a queueing context. An extension to MAP is given
in Cheung & Landriault [240]. Note that for time-independent quantities ($\delta = 0$) the
change of the Poisson intensity can also be reinterpreted as a change of the premium 
intensity for constant Poisson intensity and so an equivalent interpretation of the above 
model is to have dependence of the premium intensity between two claims on the size 
of the previous claim. Extensions of the model to include diffusion perturbation are 
studied in Zhou & Cai [917]; for an investigation of the Gerber-Shiu function for more 
general Markovian arrival processes via a fluid flow approach that avoids determining 
the roots of the Lundberg fundamental equation, see Ahn & Badescu [8] and the
recent survey Badescu & Landriault [119]. Yang [900] studies ruin-related quantities 
for a risk process that is itself a Markov chain, which also has relevance in credit risk 
applications. 

Finite-time ruin probabilities for regularly varying claim sizes and dependence that 
varies according to a Markovian environment process are studied in Biard, Lefèvre &
Loisel [162]. 

Portfolios of life-insurance contracts contain certain dependencies that are different 
from the ones of non-life portfolios. For the calculation of ruin probabilities in such a 
situation we refer to Frostig & Denuit [378]. 


6 Dependent Sparre Andersen models 


As discussed in Section VI.3a, in the Sparre Andersen model the representation

$$R_n = u + \sum_{k=1}^n (T_k - U_k), \quad n \ge 0,$$

reveals an imbedded random walk structure of the risk process with independent
increments $T_k - U_k$ (which is the difference of the inter-occurrence time and the
claim size). This random walk description enables the application of a number
of classical random walk techniques to the study of ruin probabilities and related
quantities. If one now assumes that $T_k$ and $U_k$ are not independent, but have
some joint distribution, then the random walk structure is still preserved as
long as $(T_k, U_k)$, $k \ge 1$, is an i.i.d. sequence of bivariate random variables. In
other words, one can allow the inter-occurrence time and the following claim
to be dependent (which will change the increment distribution of $T_k - U_k$) and
still use the random walk framework. Recall that $A$ and $B$ are the distribution
functions of the r.v. $T_k$ and $U_k$, respectively. Let $\kappa(s)$ denote the c.g.f. of the
increment r.v. $T_k - U_k$, i.e. $e^{\kappa(s)} = E e^{s(T_k - U_k)}$. If the dependence between $T_k$
and $U_k$ is described by a copula function $C(a, b)$, then a simple calculation gives
that $\kappa(s)$ (in its domain of convergence) is given by


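$$e^{\kappa(s)} = \hat B[-s]\,\hat A[s] - s^2 \int_0^1\!\!\int_0^1 e^{s(A^{-1}(b) - B^{-1}(a))}\,\bigl(C(a, b) - ab\bigr)\,dA^{-1}(b)\,dB^{-1}(a). \tag{6.1}$$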


This formula shows quite explicitly how the dependence structure (expressed
through the copula) and the marginal distributions $A$ and $B$ influence the shape
of $\kappa(s)$. In particular, for independent inter-occurrence times and claims we have
$C(a, b) = ab$, so the second term in (6.1) represents the correction for the
introduced dependence. Since a number of asymptotic random walk properties can
be read off from the shape of $\kappa(s)$, one can now study the effect of dependence by
investigating the resulting $\kappa(s)$. For instance, it is clear from (6.1) that positive
quadrant dependence between $T$ and $U$ (i.e. $C(a, b) \ge ab$ for all $0 \le a, b \le 1$)
implies that $\kappa(s)$ is for all $s$ smaller than the one for independence. In case
an adjustment coefficient $\gamma$ exists, it will be the solution of $\kappa(s) = 0$ and so $\gamma$
will be larger for this kind of positive dependence. More generally, whenever
there is concordance ordering for two copulas (i.e. $C_1(a, b) \ge C_2(a, b)$ for all
$0 \le a, b \le 1$), then $\gamma_1 \ge \gamma_2$. Also, the minimum of $\kappa(s)$ (which is modified
through the dependence) reveals convergence rates of finite-time ruin
probabilities (see the related Theorem V.4.5 and Veraverbeke & Teugels [864]). For
particular choices of the copula and the marginal distributions, explicit
expressions are possible.


Notes and references The model discussed in this section was introduced in Al- 
brecher & Teugels [36], where asymptotics of finite- and infinite time ruin probabilities 
and their orderings were investigated. Boudriault, Landriault & Marceau [192], Cos- 
sette, Marceau & Marri [262], Badescu, Cheung & Landriault [117] and Ambagaspitiya 
[47] establish explicit formulas for the ruin probability and Gerber-Shiu function for 
specific dependence structures within this model. An approach based on defective re- 
newal equations is given in Cheung, Landriault, Willmot & Woo [241]. For a survey 
on dependence concepts and copulas in general, see e.g. Joe [508], Nelsen [656] and 
McNeil, Frey & Embrechts [633]. Models in which dependence is introduced through 
the aggregation of several lines of business are discussed in Section 9. 


7 Gaussian models. Fractional Brownian mo- 
tion 


When modeling the reserve process $R$ or the claim surplus process $S = u - R$
of an insurance company, individual claims may be more or less important to 
take into account compared to aggregation. That is, one may either choose to 
incorporate jumps such as in the Cramér-Lundberg model or Lévy processes, or 
to use a continuous approximation. 

As examples of continuous approximations, we have already seen Brownian 
motion and more general diffusions. However, in the overall class of stochastic 
processes the most obvious other model choice that comes to mind is Gaussian 
processes. In fact, this alternative has within the last decade become popular 
within the area of queueing theory, and in many cases the problems studied 
there have as their main ingredient a ruin problem, as explained in more detail 
below. 

A process $\{X_t\}$ (with $t \ge 0$ or $-\infty < t < \infty$) is Gaussian if for all $t_1 < t_2 <
\dots < t_p$, the joint distribution of $X_{t_1}, X_{t_2}, \dots, X_{t_p}$ is $p$-dimensional normal. By
properties of the multivariate normal distribution, all that is needed to specify
the distribution of the process is the mean function $EX_t$ and the covariance
function $\mathrm{Cov}(X_{t_1}, X_{t_2})$ (as usual, we also assume D-paths).

The process is stationary if the distribution remains the same if $t_1 < t_2 <
\dots < t_p$ above are replaced by $s + t_1 < s + t_2 < \dots < s + t_p$ for arbitrary $s$. In the
specification, one then only needs $\mu = EX_t$ (that is independent of $t$) and the
covariance function. The process has stationary increments if the distribution
of $X(t + s) - X(t)$ only depends on $s$. The mean of $X(t + s) - X(t)$ must then
be of the form $\mu s$, and in the specification of the process, it suffices to know in
addition to $\mu$ only the variance function $v(s) = \mathrm{Var}(X_{t+s} - X_t)$. Covariances
do not need to be specified since for a process with stationary increments


$$\mathrm{Cov}(X_s, X_t) = \tfrac{1}{2}\bigl(v(s) + v(t) - v(|t - s|)\bigr). \tag{7.1}$$


A main motivation for using Gaussian processes in queueing is the fact that
many of the standard queueing processes after appropriate scaling and centering
converge to a Gaussian process with stationary increments. In risk theory,
we have also already seen such an example in connection with the diffusion
approximation for the Cramér-Lundberg process in V.5 where the limit is
Brownian motion (BM). Another main limit process in queueing is fractional
Brownian motion (fBm), a process $B_H$ with stationary increments, drift $\mu = 0$
and $v(s) = s^{2H}$ for some $H \in (0, 1)$ (the Hurst parameter; obviously, BM
corresponds to $H = 1/2$). It arises in connection with so-called ON-OFF models where
we have $n$ i.i.d. sources. A source is either in an ON or an OFF state and feeds
work into the system at rate 1 in ON periods. The lengths of the ON periods and
OFF periods follow distributions $F_{on}$ and $F_{off}$, respectively, and all period lengths
are assumed independent. Under stationarity, the total rate at which work is fed
into the system is then $n\mu_F$ where $\mu_F = \mu_{on}/(\mu_{on} + \mu_{off})$ and $\mu_{on}$, $\mu_{off}$ are the
means of $F_{on}$, $F_{off}$. Denote by $X_{t,n}$ the total work fed into the system before $t$.
Because of the CLT, one intuitively expects that as $n \to \infty$, after appropriate
scaling and centering, $X_{t,n}$ converges to a Gaussian process. Indeed this is true,
but the limit will in general depend on $F_{on}$, $F_{off}$ as well as on the order in
which the scaling and the $n \to \infty$ limits are taken. However, fBm arises as
follows (Taqqu et al. [835], Whitt [883]):


Theorem 7.1 Assume in the ON-OFF model that $F_{on}$, $F_{off}$ are both regularly
varying with indices $\alpha_{on}, \alpha_{off} \in (1, 2)$ and that $\alpha_{on} \ne \alpha_{off}$. Then


f : K tin TMP O H 
dm lim {og vn Bo ies hais co 
where $H = (3 - \alpha_{on} \wedge \alpha_{off})/2$ and the convergence is in $D[0, \infty)$.²


fBm is not itself particularly useful as a model for the netput process $S$
(work fed in minus the work done by the server(s)) because it has mean zero
and therefore cannot lead to stationarity. Instead, one typically assumes that
$S_t = X_t - \mu t$ where $X_t$ is Gaussian with stationary increments (fBm or some
other process). The stationary distribution of the workload (the netput process
reflected at 0) is then the same as that of $\sup_{t \ge 0} S_t$; no time reversion is needed
because stationary Gaussian processes are automatically time reversible. In
particular, the probability that the stationary workload exceeds $u$ is $\psi(u) =
P(\tau(u) < \infty)$ where as usual $\tau(u) = \inf\{t : S_t > u\}$. Thus, we are back to a
ruin problem.

Ruin problems for Gaussian processes (or equivalently, statements about
their maxima over infinite or finite time horizons) are notoriously difficult. We
shall here concentrate on one approximation method, that of the largest term,
which consists in approximating $\psi(u)$ by the tail

$$P(S_t > u) = \int_u^\infty \frac{1}{\sqrt{2\pi v(t)}} \exp\{-(x + \mu t)^2/2v(t)\}\,dx$$

of $S_t$ at $u$ for that $t = t^*$ for which the density is maximal. One uses Mill’s ratio
to approximate the above tail by

$$\frac{v(t)}{\sqrt{2\pi v(t)}\,(u + \mu t)} \exp\{-(u + \mu t)^2/2v(t)\}.$$

As a final approximation, one ignores the prefactor to the exponential, so that
the largest term approximation becomes

$$\psi(u) \approx \max_t \exp\{-(u + \mu t)^2/2v(t)\} = \exp\{-\min_t (u + \mu t)^2/2v(t)\}$$

$$= \exp\{-(u + \mu t^*)^2/2v(t^*)\}. \tag{7.3}$$


Example 7.2 Assume that $S$ is standard Brownian motion, so that $v(t) = t$.
The minimization problem is equivalent to minimizing $2\log(u + \mu t) - \log t$, which
by differentiation gives

$$\frac{2\mu}{u + \mu t} = \frac{1}{t}, \qquad t^* = \frac{u}{\mu}.$$

Insertion in (7.3) gives $\psi(u) \approx e^{-2\mu u}$ which we recognize as the exact value (cf.
II.(2.5) with $\sigma = 1$).


2a(K) is a constant that does not need to concern us here. 
Example 7.3 Assume, more generally, that $S = B_H$ is fBm, so that $v(t) = t^{2H}$.
Proceeding in the same way, we get

$$\frac{2\mu}{u + \mu t} = \frac{2H}{t}, \qquad t^* = \frac{u}{\mu}\,\frac{H}{1 - H}.$$

Insertion in (7.3) gives the approximation

$$\psi(u) \approx \exp\Bigl\{-\frac{1}{2}\Bigl(\frac{u}{1 - H}\Bigr)^{2-2H}\Bigl(\frac{\mu}{H}\Bigr)^{2H}\Bigr\}. \tag{7.4}$$
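
When the optimizer is not available in closed form, the minimization in the largest term approximation can be carried out numerically. The sketch below is one possible way to do this, using a golden-section search in $\log t$ (a swapped-in generic technique, not taken from the text) and assuming that $(u + \mu t)^2/2v(t)$ is unimodal in $t$, as it is for $v(t) = t^{2H}$. It then compares the result with the fBm optimizer $t^* = uH/(\mu(1 - H))$ derived above; the function names and search bounds are hypothetical.

```python
import math

def largest_term(u, mu, v, t_lo=1e-6, t_hi=1e6, iters=200):
    """Return (t_star, approx), where t_star minimises (u + mu*t)^2 / (2*v(t))
    on [t_lo, t_hi] and approx = exp(-minimum). Assumes a unimodal objective."""
    def g(x):                         # objective in the variable x = log t
        t = math.exp(x)
        return (u + mu * t) ** 2 / (2.0 * v(t))
    a, b = math.log(t_lo), math.log(t_hi)
    r = (math.sqrt(5.0) - 1.0) / 2.0  # golden ratio conjugate
    for _ in range(iters):
        c, d = b - r * (b - a), a + r * (b - a)
        if g(c) < g(d):
            b = d                     # the minimum lies in [a, d]
        else:
            a = c                     # the minimum lies in [c, b]
    x_star = 0.5 * (a + b)
    return math.exp(x_star), math.exp(-g(x_star))

def fbm_closed_form(u, mu, hurst):
    """Optimiser and largest term approximation for v(t) = t^(2H)."""
    t_star = u * hurst / (mu * (1.0 - hurst))
    exponent = (u + mu * t_star) ** 2 / (2.0 * t_star ** (2 * hurst))
    return t_star, math.exp(-exponent)

H = 0.75
t_num, psi_num = largest_term(10.0, 1.0, lambda t: t ** (2 * H))
t_cf, psi_cf = fbm_closed_form(10.0, 1.0, H)
```

For $u = 10$, $\mu = 1$ and $H = 0.75$ the closed form gives $t^* = 30$, and the numerical optimizer should agree with it up to the search tolerance. The same routine can be applied to other variance functions, provided the unimodality assumption is checked first.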


Remark 7.4 Approximation (7.4) shows that in the fBm case, the largest term
approximation for $\psi(u)$ has a ‘Weibull-like’ decay with exponent $r = 2 - 2H$
(for BM, we of course recover the exponential form). That is, the decay is slower
the smaller $r$ is (see also Section 1). This phenomenon can be explained from
covariance properties of fBm. Indeed, the covariances between increments can
be shown to be negative when $H < 1/2$ and positive when $H > 1/2$. Thus a
period of increase is typically followed by one of decrease when $H < 1/2$. In
other words, the increments compete to keep $S$ low, whereas they collaborate
when $H > 1/2$. A similar phenomenon exhibits itself in the path properties:
fBm has smoother paths the larger $H$ is (cf. Figure XIII.4, which contains
sample paths of fBm with $H = 0.25, 0.5, 0.75$ and $0.95$).
FIGURE XIII.4


A third example is integrability properties of the covariance function: the
sum

$$\sum_{n=1}^\infty E\bigl[S_1(S_{n+1} - S_n)\bigr]$$

converges for $H < 1/2$ but diverges for $H > 1/2$. Divergence of such sums
or integrals is often referred to as long-range dependence, and the reason to
focus on precisely this property is that certain CLT’s hold if and only if there
is convergence.


The exact asymptotics of $\psi(u)$ is in fact known for fBm (Piterbarg & Hüsler
[487], Narayan [655]):


Wa) ~ Guth HE) (EY) 


for some constant $C_H$ (that is, there is a power-prefactor to (7.4)). The constant
$C_H$ can, however, hardly be said to be explicit since it involves the so-called
Pickands constant, a quantity that shows up also in other aspects of Gaussian
process theory and is basically unknown.


Example 7.5 As a final example, we consider another popular model from
queueing studies, an integrated Ornstein-Uhlenbeck process of the form $X_t =
\int_0^t Y_v\,dv$ where $Y$ is a stationary version of the Ornstein-Uhlenbeck process
defined as the solution to $dY_t = -Y_t\,dt + dB_t$. Here one can check that
$v(t) = t - 1 + e^{-t}$. The optimizer $t^*$ is not readily computed, but one can use
large-$u$ asymptotics: if $u$ is large, then so of course is $t^*$, and one has $v(t) \sim t$ as
$t \to \infty$. Thus, for large $u$ we expect $t^*$ to have the same asymptotic form as in
Example 7.2 with Brownian motion, and we get the same approximation $e^{-2\mu u}$
as for that case.


A main justification for the largest term approximation is that it is simple to
compute, as seen from the examples. Another one is that it provides the correct
logarithmic asymptotics:

$$\psi(u) \overset{\log}{\sim} \exp\{-(u + \mu t^*)^2/2v(t^*)\}, \tag{7.6}$$

see Dębicki [280].

Finally it should be mentioned that the largest term approach also suggests
that $\tau(u)$ is of order $t^*$, thereby giving some information on the time horizon
where ruin is most likely. Such information could be valuable, e.g., in a simulation
study where attacking an infinite horizon is in general infeasible and one
could choose to simulate only up to time $kt^*$ for some suitably chosen $k > 1$.


Notes and references The literature on ruin problems for Gaussian processes is 
huge. An accessible recent survey is in Mandjes [629]. Other main textbook references 
on aspects of Gaussian processes are Adler [6] and Rue & Held [756]. For convergence 
of stochastic processes, see Whitt [883] (motivating in many cases Gaussian models). 
Apart from the largest term approach studied here, some trends in the literature
on ruin problems for Gaussian processes are many-sources asymptotics (generalizing
$n \to \infty$ in the ON-OFF model) and the so-called double sum approach, see again [629].
Ruin-type problems for fBm and other self-similar processes were also investigated in
Michna [637], Dębicki, Michna & Rolski [282], Hüsler & Piterbarg [488] and Frangos,
Vrontos & Yannacopoulos [369]. For asymptotic results on the time of ruin see e.g.
Hüsler & Piterbarg [489].


8 Ordering of ruin probabilities 


We have already seen some ordering results on ruin probabilities in Section 
IV.8 and Section VII.4. Such ordering results can be very helpful, especially 
in situations where quantitative results for ruin probabilities under dependence 
are difficult to obtain. In this section, we collect a few further ordering results 
in connection with discrete-time models with dependent increments. 

Consider the discrete-time risk process $R_n = u + \sum_{i=1}^n X_i$, where $X_i$ is the
net income of year $i$. Assume that the r.v.’s $X_i$ are dependent and light-tailed
and that the assumptions of Theorem 1.2 are fulfilled. Then we still have an
exponential decay of the ruin probability with adjustment coefficient $\gamma$ defined
by $\kappa(\gamma) = 0$. In this set-up one can now compare streams of net incomes w.r.t.
their resulting adjustment coefficient (recall the notion of convex ordering of
Section IV.8).


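Proposition 8.1 Assume that $X_1, X_2, \dots$ and $\tilde X_1, \tilde X_2, \dots$ both fulfill the
assumptions of Theorem 1.2. If $\sum_{i=1}^n X_i \le_{cx} \sum_{i=1}^n \tilde X_i$ for all $n \in \mathbb{N}$, then $\gamma \ge \tilde\gamma$.


Proof. The exponential function is convex, so we have $E e^{\theta \sum_{i=1}^n X_i} \le E e^{\theta \sum_{i=1}^n \tilde X_i}$
and subsequently $\kappa(\theta) \le \tilde\kappa(\theta)$ for all $\theta \in \mathbb{R}$, from which the result follows.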


If we now want to compare streams of net incomes with the same marginal
distributions, but different dependence structure, the so-called supermodular
order is a helpful concept. A function $f : \mathbb{R}^n \to \mathbb{R}$ is a supermodular function
if for any $x, y \in \mathbb{R}^n$

$$f(x) + f(y) \le f(x \wedge y) + f(x \vee y),$$

where the operators $\wedge$ and $\vee$ denote the component-wise minimum and maximum,
respectively (if $f$ is twice differentiable, then supermodularity means that
$\partial^2 f/(\partial x_i \partial x_j) \ge 0$ for all $1 \le i < j \le n$). Two random vectors $X, \tilde X$ are in
supermodular order ($X \le_{sm} \tilde X$) if $\mathbb{E} f(X) \le \mathbb{E} f(\tilde X)$ for all supermodular
functions $f : \mathbb{R}^n \to \mathbb{R}$. Since both functions $y \mapsto I(y > x)$ and $y \mapsto I(y \le x)$
are supermodular for each fixed $x$, it is clear that if $X \le_{sm} \tilde X$, the marginal
distributions of $X$ and $\tilde X$ have to coincide.
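
The defining inequality can be checked numerically on a finite grid. The following is a minimal sketch, not a proof: the helper name `is_supermodular_on_grid`, the grid values and the three test functions are our own illustrative choices. It confirms that a convex function of the sum and the indicator $y \mapsto I(y > x)$ pass the check, whereas $-x_1 x_2$ (mixed partial derivative $-1$) fails.

```python
import itertools
import math

def is_supermodular_on_grid(f, grid, n):
    """Check f(x) + f(y) <= f(x ^ y) + f(x v y) for all x, y in grid^n."""
    points = list(itertools.product(grid, repeat=n))
    for x in points:
        for y in points:
            lo = tuple(min(a, b) for a, b in zip(x, y))  # component-wise minimum
            hi = tuple(max(a, b) for a, b in zip(x, y))  # component-wise maximum
            if f(x) + f(y) > f(lo) + f(hi) + 1e-12:
                return False
    return True

grid = [-1.0, 0.0, 0.5, 2.0]                   # illustrative grid values

def exp_sum(x):
    return math.exp(0.7 * sum(x))               # convex function of x_1 + x_2

def indicator(y):
    return 1.0 if all(c > 0.25 for c in y) else 0.0   # I(y > x) with x = (0.25, 0.25)

def neg_prod(x):
    return -x[0] * x[1]                         # not supermodular

print(is_supermodular_on_grid(exp_sum, grid, 2))
print(is_supermodular_on_grid(indicator, grid, 2))
print(is_supermodular_on_grid(neg_prod, grid, 2))
```

A grid check can only refute supermodularity, never establish it on all of $\mathbb{R}^n$; for smooth $f$ the mixed-derivative criterion is the more convenient tool.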


Proposition 8.2 If $X \le_{sm} \tilde X$, then $\sum_{i=1}^n X_i \le_{cx} \sum_{i=1}^n \tilde X_i$.

Proof. Simply note that $\mathbb{E} f(X) \le \mathbb{E} f(\tilde X)$ for all supermodular functions
$f : \mathbb{R}^n \to \mathbb{R}$ and in particular also for those supermodular functions that are
non-decreasing and convex in each component. Let $\phi(x) = x_1 + \cdots + x_n$. Then for
every non-decreasing convex function $h$ clearly $g(x) = h(\phi(x))$ is non-decreasing
and convex. But the supermodularity implies $\mathbb{E}\phi(X) = \mathbb{E}\phi(\tilde X)$ so that we obtain
$\phi(X) \le_{cx} \phi(\tilde X)$.


From Proposition 8.1 we thus get the following criterion: 


Corollary 8.3 Assume that $X_1, X_2, \ldots$ and $\tilde X_1, \tilde X_2, \ldots$ both fulfill the
assumptions of Theorem 1.2. If $(X_1,\ldots,X_n) \le_{sm} (\tilde X_1,\ldots,\tilde X_n)$ for all $n \in \mathbb{N}$, then
$\gamma \ge \tilde\gamma$.

A random vector $X = (X_1,\ldots,X_n)$ is called associated if $\mathrm{Cov}(f(X), g(X)) \ge 0$
for all non-decreasing functions $f, g : \mathbb{R}^n \to \mathbb{R}$. The following result is another
indication that positive dependence among the risks in the insurance portfolio
is dangerous.

Proposition 8.4 Assume that $(X_1,\ldots,X_n)$ is associated for all $n \in \mathbb{N}$ and
that $\tilde X_1, \tilde X_2, \ldots$ is a sequence of independent random variables with the same
marginals. Then $\gamma \le \tilde\gamma$.


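Proof. By Proposition 8.1 it suffices to show that $\sum_{i=1}^n \tilde X_i \le_{cx} \sum_{i=1}^n X_i$, and
since $\mathbb{E}(\sum_{i=1}^n X_i) = \mathbb{E}(\sum_{i=1}^n \tilde X_i)$, it even suffices to show that

$$\sum_{i=1}^n \tilde X_i \le_{icx} \sum_{i=1}^n X_i. \tag{8.1}$$

We proceed inductively. For $n = 1$, this statement is clearly fulfilled. Assume
now that (8.1) holds. Since the $\le_{icx}$-order is closed under convolution, we then
have $\sum_{i=1}^{n+1} \tilde X_i \le_{icx} \sum_{i=1}^n X_i + \tilde X_{n+1}$, where $\tilde X_{n+1}$ is taken independent
of $X_1,\ldots,X_n$. Choosing appropriate indicator functions
in the definition of association, it is clear that

$$\mathbb{P}\Big(\sum_{i=1}^n X_i \le x_1\Big)\,\mathbb{P}(X_{n+1} \le x_2) = \mathbb{P}\Big(\sum_{i=1}^n X_i \le x_1,\ \tilde X_{n+1} \le x_2\Big) \le \mathbb{P}\Big(\sum_{i=1}^n X_i \le x_1,\ X_{n+1} \le x_2\Big)$$

for all $x_1, x_2 > 0$. But in view of the stop-loss order interpretation IV.(8.1) of
the $\le_{icx}$-order, it then follows from the general representation

$$\mathbb{E}(Z_1 + Z_2 - d)^+ = \mathbb{E}(Z_1) + \mathbb{E}(Z_2) - d + \int_0^d \mathbb{P}(Z_1 \le x,\ Z_2 \le d - x)\,dx$$

that $\sum_{i=1}^n X_i + \tilde X_{n+1} \le_{icx} \sum_{i=1}^{n+1} X_i$. The assertion now follows from the
transitivity of the $\le_{icx}$-order.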


This result is often useful, as in many situations association can be shown 
by a combination of the following properties: 


• If $X_1,\ldots,X_n$ are independent, then the vector $X = (X_1,\ldots,X_n)$ is
associated.


• If $X$ is associated and $f_1,\ldots,f_n : \mathbb{R} \to \mathbb{R}$ are non-decreasing functions,
then $(f_1(X_1),\ldots,f_n(X_n))$ is associated.


• If $X$ is associated and $f_1,\ldots,f_k : \mathbb{R}^n \to \mathbb{R}$ are non-decreasing functions,
then $(f_1(X),\ldots,f_k(X))$ is also associated.


Notes and references A general reference for stochastic orderings in the context 
of actuarial science is Denuit et al. [292]. Some ordering results of the adjustment 
coefficient under dependence can be found in Müller & Pflug [652]; see also Frostig
[375]. Stochastic orderings for random sums have (in view of the Pollaczeck-Khinchine 
formula) also implications on ruin probabilities. Such results for a given ordering of 
the involved claim number r.v. are given by Denuit, Genest & Marceau [294]; for de- 
pendence between the number of claims and their individual distribution, see Belzunce 
et al. [155]. 


9 Multi-dimensional risk processes 


Assume now that we have $n$ possibly dependent portfolios (or lines of business)
described through the vector $R_t = (R_t^1,\ldots,R_t^n)$ of risk reserve processes with
initial capital vector $u = (u_1,\ldots,u_n)$ and one Poisson process $N_t$ with intensity
$\beta$ that generates a claim in each of the components represented through the
claim vector $U_i = (U_{i1},\ldots,U_{in})$. With a premium intensity vector $p$, the
multivariate risk reserve process is given by

$$R_t = u + t\,p - \sum_{i=1}^{N_t} U_i, \quad t \ge 0. \tag{9.1}$$
Here $U_1, U_2, \ldots$ is a sequence of i.i.d. random vectors with joint distribution
function $B(x_1,\ldots,x_n)$, joint m.g.f. $\hat B[r_1,\ldots,r_n] = \mathbb{E}[\exp(r_1 U_{i1} + \cdots + r_n U_{in})]$
and marginal distributions $B_1(x_1),\ldots,B_n(x_n)$ (so in general the components
of the claim vector $U_i$ may be dependent). It is easy to think of a number of
situations where such a model applies, namely that one event or accident causes
a claim in several lines of business or several portfolios. For such a risk process,
there are now several ways to define the event of ruin and it will depend on the
situation which one is appropriate. Let $\tau_{\max}$ be the first time when all of the
components are negative, i.e.

$$\tau_{\max}(u) = \inf\{t \ge 0 \mid R_t < 0\} = \inf\{t \ge 0 \mid \max\{R_t^1,\ldots,R_t^n\} < 0\},$$


where inequalities for vectors are meant component-wise. The corresponding 
finite-time ruin probability is 


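$$\psi_{\max}(u,T) = \mathbb{P}(\tau_{\max}(u) \le T)$$

and the infinite-time ruin probability is

$$\psi_{\max}(u) = \mathbb{P}(\tau_{\max}(u) < \infty).$$

Other types of ruin times are

$$\tau_{\min}(u) = \inf\{t \ge 0 \mid \min(R_t^1,\ldots,R_t^n) < 0\}$$

and

$$\tau_{\mathrm{sum}}(u) = \inf\{t \ge 0 \mid R_t^1 + \cdots + R_t^n < 0\}.$$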


Remark 9.1 Obviously the ruin probability $\psi_{\mathrm{sum}}(u) = \mathbb{P}(\tau_{\mathrm{sum}}(u) < \infty)$
reduces the problem again to a univariate problem with $u = u_1 + \cdots + u_n$ and
i.i.d. claims $U_i = U_{i1} + \cdots + U_{in}$ (so each $U_i$ is a sum of $n$ dependent r.v.).
In this case the multivariate framework is then just the model set-up to specify
the dependence that determines the distribution of $U_i$ and through that
the ruin probability. In particular, one can now ask how dependence influences
$\psi_{\mathrm{sum}}(u)$ either by quantifying the dependence structure or by studying
stochastic ordering. In the Notes some references to corresponding work in the
literature are given (in particular for more general multivariate point processes).
In the remainder of this section we focus however on ruin definitions that leave
the problem in a 'truly' multivariate setting.


The following martingale is an extension of the Wald martingale of the uni- 
variate case. 


³ For the distribution of dependent sums, see Section XVI.2d.




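Lemma 9.2 Let $r_1,\ldots,r_n \in \mathbb{R}$ be such that $\hat B[r_1,\ldots,r_n] < \infty$. Define

$$\kappa(r_1,\ldots,r_n) = \beta\,\hat B[r_1,\ldots,r_n] - \beta - p_1 r_1 - \cdots - p_n r_n.$$

Then

$$M_t = \exp\{-r_1 R_t^1 - \cdots - r_n R_t^n - t\,\kappa(r_1,\ldots,r_n)\}, \quad t \ge 0,$$

is a martingale w.r.t. the natural filtration $\mathscr{F}$.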


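Proof. Since $N_t$ is a homogeneous Poisson process, we get for all $t, h \ge 0$

$$\mathbb{E}\Big[\exp\Big\{-\sum_{i=1}^n r_i (R_{t+h}^i - R_t^i)\Big\}\Big] = e^{h\,\kappa(r_1,\ldots,r_n)}.$$

From this it follows that

$$\mathbb{E}\Big[\exp\{-r_1 R_{t+h}^1 - \cdots - r_n R_{t+h}^n - (t+h)\,\kappa(r_1,\ldots,r_n)\} \,\Big|\, \mathscr{F}_t\Big]
= \exp\{-r_1 R_t^1 - \cdots - r_n R_t^n - t\,\kappa(r_1,\ldots,r_n)\}.$$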


Let us consider the situation of light-tailed marginal claim size distributions
and define $r_j^0 = \sup\{r_j \mid \hat B[0,\ldots,0,r_j,0,\ldots,0] < \infty\}$ as the abscissa
of convergence of the m.g.f. of the marginal r.v. $U_{ij}$. Define further the
sets $G = \{(r_1,\ldots,r_n) \in \mathbb{R}^n \mid \hat B[r_1,\ldots,r_n] < \infty\}$, $G^0 = G \cap (0,\infty)^n$,
$\Delta = \{(r_1,\ldots,r_n) \in G \mid \kappa(r_1,\ldots,r_n) = 0\}$ and $\Delta^0 = \Delta \cap (0,\infty)^n$.

Let $\mu$ be the vector containing the expected values of the marginal claim
size distributions.


Proposition 9.3 Assume that the component-wise net profit condition $\beta\mu < p$
holds. If $r_j^0 > 0$ for all $j = 1,\ldots,n$ and $\sup_{(r_1,\ldots,r_n) \in G^0} \kappa(r_1,\ldots,r_n) > 0$, then

$$\psi_{\max}(u) \le \inf_{(r_1,\ldots,r_n) \in \Delta^0} e^{-\sum_{i=1}^n r_i u_i}.$$


An example of the shape of the set $\Delta$ for dimension $n = 2$ is illustrated in
Fig. XIII.5 (the arrows are unimportant for the moment but will show up below
in Remark 9.5).

Figure XIII.5: The set $\Delta$


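Proof of Proposition 9.3. Due to Lemma 9.2, $M$ is a martingale and $\tau_{\max}(u)$ is
a stopping time. For every $(r_1,\ldots,r_n)$ with $\hat B[r_1,\ldots,r_n] < \infty$ we know from
Lemma 9.2 that

$$e^{-\sum_{i=1}^n r_i u_i} = \mathbb{E}[M_{\tau_{\max}(u) \wedge t}] \ge \mathbb{E}[M_{\tau_{\max}(u)};\ \tau_{\max}(u) \le t]
= \mathbb{E}[M_{\tau_{\max}(u)} \mid \tau_{\max}(u) \le t]\,\mathbb{P}(\tau_{\max}(u) \le t).$$

For all $(r_1,\ldots,r_n) \in G^0$

$$e^{-\sum_{i=1}^n r_i R^i_{\tau_{\max}(u)}} \ge 1,$$

which thus leads to

$$\mathbb{P}(\tau_{\max}(u) \le t) \le e^{-\sum_{i=1}^n r_i u_i} \sup_{0 \le s \le t} e^{s\,\kappa(r_1,\ldots,r_n)}.$$

It is easy to check by taking partial derivatives that along every ray from 0
into $(0,\infty)^n$, $\kappa(r_1,\ldots,r_n)$ is a continuous and convex function that (with the
positive safety loading for each component) has negative derivative in 0 and by
$\kappa(0,\ldots,0) = 0$ and continuity will hence satisfy $\kappa(r_1,\ldots,r_n) = 0$ for at least
one $(r_1,\ldots,r_n) \in G^0$, i.e. $\Delta^0$ is not empty. Hence we can write

$$\mathbb{P}(\tau_{\max}(u) \le t) \le \inf_{(r_1,\ldots,r_n) \in \Delta^0} e^{-\sum_{i=1}^n r_i u_i}.$$

Letting $t \to \infty$ then gives the result.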
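
To get a feeling for the size of this Lundberg-type bound, one can evaluate it numerically in a simple case. The sketch below makes several assumptions that are not part of the text: dimension $n = 2$, independent exponential claim components with rates `nu[0]`, `nu[1]` (so that $\hat B[r_1,r_2] = \frac{\nu_1}{\nu_1 - r_1}\frac{\nu_2}{\nu_2 - r_2}$), and a crude grid search over $r_1 \in (0,\gamma_1)$, where $\gamma_1 = \nu_1 - \beta/p_1$ is the marginal adjustment coefficient of the first component. For such $r_1$ we have $\kappa(r_1,0) < 0$, so by convexity there is exactly one $r_2 > 0$ with $\kappa(r_1,r_2) = 0$, which is found by bisection. Since the search may cover only part of the zero set, the value returned is a valid upper bound but not necessarily the infimum.

```python
import math

def kappa(r1, r2, beta, p, nu):
    """kappa(r1, r2) = beta*(Bhat[r1, r2] - 1) - p1*r1 - p2*r2, assuming
    independent Exp(nu1), Exp(nu2) claim components (assumption of this sketch)."""
    bhat = nu[0] / (nu[0] - r1) * nu[1] / (nu[1] - r2)
    return beta * (bhat - 1.0) - p[0] * r1 - p[1] * r2

def root_r2(r1, beta, p, nu, tol=1e-12):
    """Bisection for the unique r2 in (0, nu2) with kappa(r1, r2) = 0,
    valid when kappa(r1, 0) < 0."""
    lo, hi = 0.0, nu[1] * (1.0 - 1e-12)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kappa(r1, mid, beta, p, nu) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def lundberg_bound(u, beta, p, nu, grid=2000):
    """Upper bound exp(-max(r1*u1 + r2*u2)) over grid points of the zero set."""
    gamma1 = nu[0] - beta / p[0]      # kappa(r1, 0) < 0 exactly on (0, gamma1)
    best = 0.0
    for k in range(1, grid):
        r1 = gamma1 * k / grid
        r2 = root_r2(r1, beta, p, nu)
        best = max(best, r1 * u[0] + r2 * u[1])
    return math.exp(-best)

# Illustrative parameters: beta = 1, unit-mean claims, premium rate 1.5 per component.
bound = lundberg_bound(u=(5.0, 5.0), beta=1.0, p=(1.5, 1.5), nu=(1.0, 1.0))
print(bound)
```

In this symmetric example the best exponent is attained near the diagonal $r_1 = r_2$, and the resulting bound is noticeably smaller than the marginal bound $e^{-\gamma_1 u_1}$ of a single component.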
Example 9.4 A particular boundary case of this multivariate risk model with
practical relevance of its own is the two-dimensional model

$$\begin{pmatrix} R_t^1 \\ R_t^2 \end{pmatrix} = \begin{pmatrix} u_1 \\ u_2 \end{pmatrix} + \begin{pmatrix} p_1 \\ p_2 \end{pmatrix} t - \begin{pmatrix} a \\ 1-a \end{pmatrix} \sum_{i=1}^{N_t} U_i, \quad t \ge 0.$$

Here the dependence between the two claim components is the strongest possible,
namely comonotonic dependence. The interpretation is that the (univariate)
claims $U_i$ are proportionally shared by two portfolios which may have different
premium intensities $p_1, p_2$ (reflecting different safety loadings) and the question
is for instance how to allocate initial capital $u_1$ and $u_2$ in a sensible way
so as to minimize the ruin probability $\psi_{\min}$. One immediately observes that
$\tau_{\min}(u)$ can also be represented as $\tau_{\min}(u) = \inf\{t \ge 0 \mid \sum_{i=1}^{N_t} U_i > q(t)\}$ with
$q(t) = \min\{(u_1 + p_1 t)/a,\ (u_2 + p_2 t)/(1-a)\}$. So in fact this two-dimensional
model can be treated as a one-dimensional crossing problem of a compound
Poisson process over a piecewise linear barrier. If $p_1/a > p_2/(1-a)$ and
$u_1/a > u_2/(1-a)$, then the barrier is linear and one bounces back to the
classical risk model. Extensions of this capital allocation problem to higher
dimensions and more general claim arrival processes are obvious.
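
The barrier of Example 9.4 is easy to evaluate directly. The following minimal sketch (function names and parameter values are our own) computes $q(t)$, tests whether the barrier is linear, and otherwise returns the time at which the two lines cross, i.e. the kink of the piecewise linear barrier.

```python
def barrier(t, u, p, a):
    """q(t) = min{(u1 + p1*t)/a, (u2 + p2*t)/(1 - a)}."""
    return min((u[0] + p[0] * t) / a, (u[1] + p[1] * t) / (1.0 - a))

def barrier_is_linear(u, p, a):
    """True if one of the two lines lies below the other for all t >= 0."""
    i1, s1 = u[0] / a, p[0] / a                    # intercept and slope of line 1
    i2, s2 = u[1] / (1.0 - a), p[1] / (1.0 - a)    # intercept and slope of line 2
    return (i1 >= i2 and s1 >= s2) or (i2 >= i1 and s2 >= s1)

def kink_time(u, p, a):
    """Crossing time of the two lines, or None if the barrier is linear."""
    if barrier_is_linear(u, p, a):
        return None
    i1, s1 = u[0] / a, p[0] / a
    i2, s2 = u[1] / (1.0 - a), p[1] / (1.0 - a)
    return (i2 - i1) / (s1 - s2)

# Illustrative values: equal claim sharing, but unequal capital and premiums.
u, p, a = (1.0, 4.0), (2.0, 1.0), 0.5
print(barrier(0.0, u, p, a), kink_time(u, p, a), barrier_is_linear(u, p, a))
```

With these values the first portfolio determines ruin before the kink and the second one afterwards, which is exactly the situation where the capital allocation question is non-trivial.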


Remark 9.5 In Collamore [250, 251] a related multi-dimensional ruin problem
is considered, namely to estimate the probability that a random walk $\{S_n\}$ in
$\mathbb{R}^d$ hits a rare set. More precisely, we will assume that the rare set has the form
$xA = \{xa : a \in A\}$, where $A$ is convex and $x$ is a large parameter, and that
the random walk in itself would typically avoid $xA$. For this, the drift vector
$\mu = \mathbb{E}S_1$ should as a minimum satisfy $t\mu \notin xA$ for all $t$ and $x$ (technically, the
existence of a separating hyperplane is sufficient). For simplicity, we take $d = 2$.
Define $\tau(x) = \inf\{n : S_n \in xA\}$ and $z(x) = \mathbb{P}(\tau(x) < \infty)$.

The situation is as in Figure XIII.6. Here $x(k)$ is the point at which the
line with direction $k$ hits $xA$. The mean drift vector $\mu$ points away from $A$, so
an obvious possibility is to use an exponential change of measure changing the
drift to some $k$ pointing towards $A$.

Exponential change of measure is defined along similar lines as for the
multidimensional continuous-time risk process. We let

$$\kappa(\theta) = \kappa(\theta_1,\theta_2) = \log \mathbb{E}\, e^{\theta_1 X_1 + \theta_2 X_2},$$

where $(X_1, X_2) = S_1$. The exponentially tilted measure is a random walk with
increment distribution satisfying

$$\mathbb{E}_{\theta_1,\theta_2} h(X_1,X_2) = \mathbb{E}\big[h(X_1,X_2) \exp\{\theta_1 X_1 + \theta_2 X_2 - \kappa(\theta_1,\theta_2)\}\big]. \tag{9.2}$$
Figure XIII.6: The ruin set $xA$


It easily follows that the changed drift under $\mathbb{P}_{\theta_1,\theta_2}$ is given by

$$\mu_{\theta_1,\theta_2} = \mathbb{E}_{\theta_1,\theta_2}(X_1,X_2) = (\mu_1,\mu_2) = \nabla\kappa(\theta_1,\theta_2), \tag{9.3}$$

where $\nabla$ denotes the gradient. In Fig. XIII.5, the arrows pointing outward from
$\Delta$ are the gradients. The gradient is orthogonal to $\Delta$ and at any given point,
its length is twice the radius of curvature at the given point of $\Delta$.

Thus, we face the problem of which (row-vector) $\gamma \in \Delta$ to work with. A
lower bound for $z(x)$ is given by the probability of the path following the
$\mathbb{P}_\gamma$-description, i.e. by

$$z(x) = \mathbb{P}(\tau(x) < \infty) = \mathbb{E}_\gamma\, e^{-\gamma S_{\tau(x)}} \approx e^{-\gamma\, x(\mu_\gamma)}. \tag{9.4}$$

This suggests to take $\gamma = \gamma^*$ where

$$\gamma^* = \operatorname{argmin}_\gamma\ \gamma \cdot x(\mu_\gamma).$$

It can be shown that (under appropriate conditions) indeed the correct
logarithmic asymptotics corresponds to taking $\gamma = \gamma^*$ in (9.4) and that the
corresponding exponential change of measure with $\theta = \gamma^*$ leads to logarithmic
efficiency; see Collamore [251].
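
The relation between the tilting parameter and the changed drift in (9.3) can be made concrete in a case where everything is explicit. The sketch below assumes, purely for illustration, bivariate Gaussian increments with mean vector `mean` and covariance matrix `cov`, so that $\kappa(\theta) = \theta\cdot m + \tfrac12 \theta^{\mathsf T}\Sigma\theta$ and $\nabla\kappa(\theta) = m + \Sigma\theta$; the numerical gradient is computed by central differences and compared with this closed form. All names and parameter values are our own.

```python
def kappa_gauss(theta, mean, cov):
    """Cumulant function of a bivariate normal increment (assumption of this sketch)."""
    lin = theta[0] * mean[0] + theta[1] * mean[1]
    quad = sum(theta[i] * cov[i][j] * theta[j] for i in range(2) for j in range(2))
    return lin + 0.5 * quad

def tilted_drift(theta, mean, cov, h=1e-6):
    """Gradient of kappa at theta by central differences, cf. (9.3)."""
    grad = []
    for i in range(2):
        up = list(theta)
        dn = list(theta)
        up[i] += h
        dn[i] -= h
        grad.append((kappa_gauss(up, mean, cov) - kappa_gauss(dn, mean, cov)) / (2.0 * h))
    return grad

mean = (-1.0, -0.5)                   # original drift, pointing away from the rare set
cov = [[1.0, 0.3], [0.3, 2.0]]
theta = (0.5, 0.2)

numeric = tilted_drift(theta, mean, cov)
closed_form = [mean[i] + sum(cov[i][j] * theta[j] for j in range(2)) for i in range(2)]
print(numeric, closed_form)
```

In the Gaussian case the tilt simply shifts the drift by $\Sigma\theta$, which makes it transparent how a suitable $\theta$ turns a drift pointing away from $A$ into one pointing towards it.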


If the claim components are heavy-tailed, the general picture is less complete.
Here is a simple result on asymptotic finite-time ruin probabilities for a risk
process of the type (9.1) with independent components, but with a possibly
more general claim number process:


Proposition 9.6 Assume that $B(x_1,\ldots,x_n) = \prod_{i=1}^n B_i(x_i)$ and $B_i \in \mathscr{S}$.
Assume further that $\mathbb{E} z^{N_T} < \infty$ for some $z > 1$, where $N_T$ is the number of claims
up to time $T$. Then for fixed $T > 0$

$$\psi_{\max}(u,T) \sim \mathbb{E}[(N_T)^n] \prod_{i=1}^n \bar B_i(u_i), \quad u \to \infty. \tag{9.5}$$
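Proof. Since

$$\psi_{\max}(u,T) = \mathbb{P}\Big(\sum_{i=1}^{N_t} U_i - t\,p > u \ \text{ for some } 0 \le t \le T\Big),$$

we have the simple upper bound

$$\psi_{\max}(u,T) \le \mathbb{P}\Big(\sum_{i=1}^{N_T} U_i > u\Big) = \sum_{m=0}^\infty \mathbb{P}(N_T = m) \prod_{j=1}^n \mathbb{P}\Big(\sum_{i=1}^m U_{ij} > u_j\Big).$$

By the subexponential property of the marginals, Lemma X.1.8, Lemma X.2.2
and dominated convergence, the latter is asymptotically equal to

$$\sum_{m=0}^\infty \mathbb{P}(N_T = m)\, m^n \prod_{i=1}^n \bar B_i(u_i) = \mathbb{E}[(N_T)^n] \prod_{i=1}^n \bar B_i(u_i) \tag{9.6}$$

so that we have the upper bound

$$\psi_{\max}(u,T) \le (1 + o(1))\, \mathbb{E}[(N_T)^n] \prod_{i=1}^n \bar B_i(u_i).$$

Similarly, a lower bound for the finite-time ruin probability is

$$\psi_{\max}(u,T) \ge \mathbb{P}\Big(\sum_{i=1}^{N_T} U_i - T p > u\Big) = \sum_{m=0}^\infty \mathbb{P}(N_T = m) \prod_{j=1}^n \mathbb{P}\Big(\sum_{i=1}^m U_{ij} > u_j + T p_j\Big).$$

By the long-tailed property of the subexponential distribution, this is
asymptotically also equal to (9.6) so that

$$\psi_{\max}(u,T) \ge (1 + o(1))\, \mathbb{E}[(N_T)^n] \prod_{i=1}^n \bar B_i(u_i)$$

and we have asymptotic equivalence.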
Remark 9.7 Note that by Jensen's inequality $\mathbb{E}[(N_T)^n] \ge \mathbb{E}[N_T]^n$, so that
$\psi_{\max}(u,T)$ is asymptotically larger than the product of the $n$ marginal
one-dimensional finite-time ruin probabilities, which can be explained by the
common claim number process that governs the $n$ components.
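
The size of the factor $\mathbb{E}[(N_T)^n]/\mathbb{E}[N_T]^n$ is easily quantified when $N_T$ is Poisson, as in model (9.1). The following sketch (the function name and the value $\lambda = \beta T = 3$ are illustrative assumptions) computes $\mathbb{E}[(N_T)^n]$ by truncating the defining series and compares it with the known closed forms $\lambda + \lambda^2$ for $n = 2$ and $\lambda^3 + 3\lambda^2 + \lambda$ for $n = 3$.

```python
import math

def poisson_moment(lam, n, terms=200):
    """E[N^n] for N ~ Poisson(lam), n >= 1, via the truncated series
    sum_{m >= 1} m^n * exp(-lam) * lam^m / m!."""
    total = 0.0
    pmf = math.exp(-lam)              # P(N = 0)
    for m in range(1, terms):
        pmf *= lam / m                # P(N = m) obtained from P(N = m - 1)
        total += (m ** n) * pmf
    return total

lam = 3.0                             # lam = beta * T, illustrative value
for n in (2, 3):
    moment = poisson_moment(lam, n)
    print(n, moment, lam ** n, moment / lam ** n)
```

Already for two components the ratio equals $1 + 1/\lambda$, so the effect of the common claim number process is most pronounced when few claims are expected up to time $T$.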


Notes and references Although multivariate ruin theory is a very natural exten- 
sion of classical ruin theory with a lot of potential applications also in fields outside 
of insurance (such as credit risk or barrier option pricing), this research field is not 
yet very far developed. As in Remark 9.5, the event of ruin can in general be defined 
as the first passage of $R_t$ into an $n$-dimensional open set $A$ that does not contain
0. An early paper in such a framework is Dembo, Karlin & Zeitouni [289] for mul- 
tivariate Lévy processes. In the framework of Remark 9.5, Collamore [250, 251] also 
derived asymptotic results for the time of ruin; see in addition Borovkov & Mogul’skii 
[186, 187, 188]. Huh & Kolkiewicz [483] deal with ruin probabilities for multivariate 
diffusions and applications to the pricing of credit risk products. 

The particular risk model (9.1) was investigated for two dimensions in Chan, Yang 
& Zhang [230]. The martingale approach was also implemented in Li, Liu & Tang [583] 
who worked out some concrete examples of dependence structures for two dimensions 
and illustrated that it is easily possible to extend Proposition 9.3 for the situation where 
a Brownian perturbation is added in each component of (9.1) with a joint correlation 
matrix (the form of $\kappa$ is then correspondingly modified). Finite-time ruin probability
approximations via a bivariate compound binomial model for two dimensions as well 
as some ordering results are given in Yuen, Guo & Wu [910] and Cai & Li [219]. In the 
latter paper also an explicit solution for $\psi_{\mathrm{sum}}(u)$ for multivariate phase-type claims is
derived, see also Eisele [341] for a Panjer-type recursion and Sundt & Vernic [824] for 
a more general treatment. 

Explicit results for $\psi_{\max}(u)$ and $\psi_{\min}(u)$ are usually out of reach (except for
very simple situations, see e.g. Dang et al. [272]); however, it was shown in Cai
& Li [219] that if the claim vectors are associated, then
$\prod_{i=1}^n \psi_i(u_i) \le \psi_{\max}(u) \le \min_{1 \le i \le n} \psi_i(u_i)$,
where $\psi_i(u_i)$ is the marginal ruin probability of the $i$th component.
From this it is not hard to establish in two dimensions the bound

$$\max\{\psi_1(u_1), \psi_2(u_2)\} \le \psi_{\min}(u_1,u_2) \le \psi_1(u_1) + \psi_2(u_2) - \psi_1(u_1)\psi_2(u_2).$$


The model of Example 9.4 with $N_t$ being a renewal process is studied by Avram,
Palmowski & Pistorius [111]. In particular, asymptotic results for light-tailed claim 
size distributions are derived for two dimensions. Related two-sided barrier crossing 
problems for compound Poisson processes and analogies to queueing problems are 
studied in Perry, Stadje & Zaks [693]. 

If $U_i$ in (9.1) is multivariate regularly varying, quite explicit and intuitive asymptotic
results can be obtained for renewal claim number processes, see Hult & Lindskog
[485] with a slightly different definition of ruin. 

As mentioned in Remark 9.1, if the claim sizes in each business line and between
business lines (i.e. components) are i.i.d. and the aim is to assess $\psi_{\mathrm{sum}}$, then it is often
possible to transform more complicated multivariate point processes into simpler ones.
One example is that $N_t$ is a superposition of counting processes each of which causes
claims only in a selection of components, in which case one can usually identify a
one-dimensional reformulation of the model with a modified (mixed) claim distribution
(see e.g. Yuen, Guo & Wu [909], Ambagaspitiya [46] and for stochastic orderings
Frostig [375] and Lindskog & McNeil [599]). Pfeifer & Nešlehová [696] and Bäuerle
& Grübel [143] use copulas and random time shifts to generate multivariate claim
counting distributions with Poisson marginals; for another general flexible multivariate
counting process, see Bäuerle & Grübel [144].

Ruin probabilities of the type $\psi_{\mathrm{sum}}$ and Lundberg bounds in a discrete multivariate
autoregressive model are investigated in Zhang, Yuen & Li [916]. Since $e^{r\sum_{i=1}^n x_i}$ is
a supermodular function, it is immediately clear that if $U \le_{sm} \tilde U$ in (9.1), then the
respective adjustment coefficients for $\psi_{\mathrm{sum}}$ (if they exist) fulfill $\tilde\gamma \le \gamma$. Extensions of
this result to Cox models and finite-time Lundberg inequalities are given in Juri [513],
see also Macci, Stabile & Torrisi [620].

Bregman & Klüppelberg [198] show that if two compound Poisson processes are
coupled by a Clayton Lévy copula, one can obtain quite explicit asymptotic results
for $\psi_{\mathrm{sum}}$. For stochastic ordering results of component-wise ruin times and $\psi_{\mathrm{sum}}$ in a
general multivariate set-up with Lévy copulas see Bäuerle, Blatter & Müller [142].

Another multivariate aspect is competing claim processes. That is, the reserve
process is $R_t = R_t^1 + \cdots + R_t^k$, and if ruin occurs, one may ask which of the $k$
components caused it. In particular, if all of the $R^i$ can only go downwards by a jump,
it is a well-defined question which $R^i$ actually performed the jump to make the reserve
go negative. However, especially in the light-tailed case this does not tell the whole
story: some other $R^j$ may have taken $R$ close to zero before ruin. For work in this
direction, see Huzak et al. [491].
Chapter XIV


Stochastic control 


1 Introduction 


The purpose of stochastic control is to find strategies that are optimal in the
sense of maximizing a suitably defined reward function. In the setting of this
book, consider the risk reserve process. Time may be discrete or continuous,
and the time horizon finite (deterministic or a stopping time) or infinite, and
we will denote by $T$ its upper limit. Assume given a set of possible actions. At
each time $t$ the controller then chooses one particular action $u_t$, and the function
$U = (u_t)_{t \le T}$ is denoted a strategy, the set of admissible strategies over which
to maximize is $\mathscr{U}$, and the reserve process governed by a particular strategy is
$\{R_t^U\}$ (for notational convenience $\{R_t\}$ when it is clear what $U$ is). We further
assume that the reserve has a given initial value $R_0 = x$.¹

In discrete time, the reward to be maximized over $U$ will have the form

$$V_T^U(x) = \mathbb{E}_x \sum_{t=0}^T r(R_t^U, u_t, t). \tag{1.1}$$


Thus $r(x,u,t)$ is the gain of using strategy $u$ at reserve level $x$ at time $t$. The
function $r$ may be negative, which corresponds to a loss, not a gain. It is
common to assume that $t$ only enters via a discounting factor $\delta$, i.e. (with a slight
abuse of notation) that $r(x,u,t) = e^{-\delta t} r(x,u)$, but we will not always make
this assumption. It is, however, convenient in continuous time problems, and
in infinite-horizon problems one certainly needs $r(x,u,t)$ to somehow decrease
with $t$ since otherwise the total $V_T^U(x)$ may well be infinite.


¹ Note that $u$ is used for the initial reserve in most of the rest of the book. We use a
different symbol here to avoid confusion with the control.
The value $V_T(x)$ of holding initial reserve $x$ is obtained by maximizing over
the set $\mathscr{U}$ of strategies under consideration, i.e.

$$V_T(x) = \sup_{U \in \mathscr{U}} V_T^U(x). \tag{1.2}$$

The supremum may or may not be attained. When it is attained, the maximizer
is denoted by $U^*$. The function $x \mapsto V_T(x)$ is denoted the value function.

In continuous time, the reward will have the form

$$V_T^U(x) = \mathbb{E}_x\Big[r_T(R_T^U) + \int_0^T r(R_t^U, u_t, t)\,dt\Big]. \tag{1.3}$$

The added term $r_T(R_T^U)$ corresponds to a terminal reward (or punishment). The
value function is again defined by (1.2).


Example 1.1 Consider a risk process in discrete time, such that the amount
of premiums received at each time instant $t = 1, 2, \ldots$ is 1 and the claim amount
for the time period $(t-1, t]$ is a r.v. $Y_t$, such that the $Y_t$ are i.i.d. and satisfy
$\mathbb{E}Y_t < 1$. Thus, without any form of control we have

$$R_0 = x, \quad R_1 = x + 1 - Y_1, \quad R_2 = R_1 + 1 - Y_2, \ \ldots \tag{1.4}$$

and the time horizon $T$ is the time $\tau$ of ruin when $\tau < \infty$, $\infty$ otherwise. As an
example of a control problem, consider dynamic proportional reinsurance and
minimization of the ruin probability. That is, at time $t$ the company chooses
to reinsure a proportion $u_t \in [0,1]$ of its portfolio so that $Y_t$ is replaced by
$(1 - u_t)Y_t$. If reinsurance is cheap, i.e. the premium income changes in the
same proportion (replace 1 by $1 - u_t$ in (1.4)), this reduces variability and so
potentially the ruin probability. However, in practice reinsurance is not cheap:
the premium $b(u_t)$ to pay for reinsurance will typically satisfy $b(u) > u$ so that
the drift is reduced, which potentially increases the ruin probability, and there is
a trade-off. Note that $\mathscr{U} = [0,1]$.

The problem can be put into the framework (1.1) by taking $T = \infty$, $[0,\infty) \cup
\{\Delta\}$ as state space, modifying $R$ by letting $R_t = \Delta$ when $\tau < t \le \infty$ ($\tau$ is the
time of ruin) and taking the reward function as $r(x,u,t) = -1$ for $x < 0$ and 0
otherwise. Then the sum in (1.1) is $-1$ when $\tau < \infty$ and 0 otherwise, and so
$V^U(x)$ is minus the probability of ruin under strategy $U$.
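
The trade-off described in Example 1.1 can be explored by simulation. The sketch below is only an illustration under assumptions that are not in the text: a constant reinsurance proportion $u$ (rather than a dynamic strategy), exponential claims with mean `mean_claim`, a reinsurance premium of the hypothetical form $b(u) = (1+\eta)u$ with $\eta > 0$, and a finite horizon, so that the estimate is a finite-time ruin probability.

```python
import random

def ruin_probability(x, u, eta, mean_claim, horizon, n_paths, seed=0):
    """Monte Carlo estimate of P(ruin before `horizon`) for the recursion
    R_t = R_{t-1} + 1 - b(u) - (1 - u) * Y_t with b(u) = (1 + eta) * u
    (hypothetical premium rule) and Y_t ~ Exp with mean `mean_claim`."""
    rng = random.Random(seed)
    income = 1.0 - (1.0 + eta) * u    # premium 1 minus reinsurance premium b(u)
    ruined = 0
    for _ in range(n_paths):
        r = x
        for _ in range(horizon):
            claim = rng.expovariate(1.0 / mean_claim)
            r += income - (1.0 - u) * claim
            if r < 0.0:
                ruined += 1
                break
    return ruined / n_paths

# Compare a few constant reinsurance levels for the same initial reserve.
for u in (0.0, 0.3, 0.6):
    print(u, ruin_probability(x=2.0, u=u, eta=0.1, mean_claim=0.5,
                              horizon=50, n_paths=2000))
```

Varying `eta` shows the two competing effects: a larger retained share $1 - u$ keeps more variability, while a larger $u$ reduces the drift $1 - b(u) - (1-u)\mathbb{E}Y_t$.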


In this chapter we will only work with feedback strategies. This means that
we assume $R^U$ to be (possibly time-inhomogeneous) Markov with transition
mechanism at time $t$ depending on $u_t$ and that $u_t$ is chosen as a function of $R_t^U$
and $t$ only.

Notes and references In the Markovian setting it may certainly seem
counterintuitive that a $u_t$ depending on some further characteristics of $\{R_s\}_{s \le t}$ could be
optimal. However, the problem is more complicated than it may look. For some
discussion in discrete time, see Blackwell [171] and Bertsekas & Shreve [160]. In
continuous time, one typically first finds a feedback strategy that is a candidate for being
optimal and subsequently gives a proof of the optimality by a so-called verification
theorem.


2 Stochastic dynamic programming 


Stochastic dynamic programming is a method for solving the optimization problem
(1.2) in discrete time. The model then means that $R^U = (R_0^U,\ldots,R_T^U)$ is a
Markov chain whose transition probabilities $p(t,x,y,u)$ from $t$ to $t+1$ and from
state $x$ to state $y$ depend on $u = u_t$.

The idea is most readily explained by first assuming $R^U$ to have a finite state
space $E$, the set of all controls $u$ to be finite, and $T < \infty$ to be deterministic.
We then define the value function $V(t,x)$ at time $t \le T$ and in state $x$ as

$$V(t,x) = \max_{U^t \in \mathscr{U}^t} \mathbb{E}_{t,x} \sum_{s=t}^T r(R_s^{U^t}, u_s, s), \tag{2.1}$$

where $\mathscr{U}^t$ is the set of all admissible $U^t = (u_t,\ldots,u_T)$, and we denote by $u^*(t,x)$
a control which is optimal (it does not necessarily have to be unique). Clearly,
the strategy $U^*$ given by $u_0 = u^*(0,x_0)$, $u_1 = u^*(1,R_1)$, $\ldots$, $u_T = u^*(T,R_T)$ is
then optimal.

To compute $u^*(t,x)$, one proceeds backward in time. At $t = T$, it is obvious
that $u^*(T,x) = \operatorname{argmax}_u r(x,u,T)$ and that $V(T,x) = r(x,u^*(T,x),T)$.

Assume the values $V(t+1,y)$ have been computed for all $y \in E$. Then
clearly

$$V(t,x) = \max_u \Big[r(x,u,t) + \sum_{y \in E} p(t,x,y,u)\,V(t+1,y)\Big], \tag{2.2}$$

and $u^*(t,x)$ is a maximizer.
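
The backward recursion translates almost literally into code. The following is a minimal sketch for a finite state space and a finite control set; the function name `backward_induction` and the toy two-state example are our own illustrative choices. Each backward step adds the current-period reward $r(x,u,t)$ to the expected continuation value, as the definition (2.1) of $V(t,x)$ as a sum from $s = t$ to $T$ requires.

```python
def backward_induction(T, states, controls, p, r):
    """Finite-horizon stochastic dynamic programming.
    p(t, x, y, u): transition probability from x at time t to y at time t + 1.
    r(x, u, t):    reward for using control u in state x at time t.
    Returns (V, policy) with V[t][x] the value and policy[t][x] an optimal control."""
    V = [dict() for _ in range(T + 1)]
    policy = [dict() for _ in range(T + 1)]
    for x in states:                              # initial step t = T
        best_u = max(controls, key=lambda u: r(x, u, T))
        policy[T][x] = best_u
        V[T][x] = r(x, best_u, T)
    for t in range(T - 1, -1, -1):                # backward steps t = T-1, ..., 0
        for x in states:
            def q(u, t=t, x=x):
                cont = sum(p(t, x, y, u) * V[t + 1][y] for y in states)
                return r(x, u, t) + cont
            best_u = max(controls, key=q)
            policy[t][x] = best_u
            V[t][x] = q(best_u)
    return V, policy

# Toy example: two states; control 1 switches state at cost 0.3, state 1 pays 1 per period.
states, controls = (0, 1), (0, 1)

def p_toy(t, x, y, u):
    target = x if u == 0 else 1 - x
    return 1.0 if y == target else 0.0

def r_toy(x, u, t):
    return float(x) - 0.3 * u

V, policy = backward_induction(2, states, controls, p_toy, r_toy)
print(V[0], policy[0])
```

In the toy example it pays to switch from state 0 early (while enough periods remain to recover the switching cost) but not at the final time, which the computed policy reflects.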
Note that in this setting where everything is finite, we in principle have a
finite maximization problem:

$$V(0,x) = \max_{u_0,\ldots,u_T} \mathbb{E}_x \sum_{t=0}^T r(R_t^U, u_t, t), \tag{2.3}$$

where $U = (u_0,\ldots,u_T)$. The advantage of stochastic dynamic programming
is to reduce complexity. Say we have $p$ Markov states and $q$ possible controls.
Using (2.3), the expectation is then a sum over all $p^T$ possible Markov chain
paths, where each sum contains $T+1$ terms, and this has to be evaluated
over all $q^{T+1}$ possible control combinations. Thus the total complexity is
$(T+1)p^T q^{T+1}$. In contrast, each sum in (2.2) has $p$ terms and has to be evaluated
for $q$ controls $u$ and $p$ Markov states $x$. Thus the complexity of each backward
step is $p^2 q$ and the total complexity of the stochastic dynamic programming
algorithm is $Tp^2 q + pq$ (the second term comes from the initial $t = T$ step).
This is typically an enormous saving over $(T+1)p^T q^{T+1}$, not least in situations
that are a discretization of a problem containing continuous components so that
$p$ and/or $q$ and/or $T$ may be huge.
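
To see how large the saving is, the two operation counts can simply be evaluated. The sketch below uses the formulas stated above; the function names and the values $T = 10$, $p = 20$, $q = 5$ are illustrative assumptions.

```python
def brute_force_cost(T, p, q):
    """Operation count (T + 1) * p^T * q^(T + 1) for direct evaluation of (2.3)."""
    return (T + 1) * p ** T * q ** (T + 1)

def dynamic_programming_cost(T, p, q):
    """Operation count T * p^2 * q + p * q for the backward recursion."""
    return T * p ** 2 * q + p * q

T, p, q = 10, 20, 5
print(brute_force_cost(T, p, q))
print(dynamic_programming_cost(T, p, q))
```

Even for this modest problem size the direct enumeration is out of reach, whereas the backward recursion needs only a few tens of thousands of operations.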

Beyond the finite setting just considered, one can in principle set up a rather
similar scheme, but a considerable amount of difficulties will typically arise. If $E$
and/or the set of controls is countable or continuous, it may be difficult just to
compute $\mathbb{E}[R_{t+1} \mid R_t = x,\ u_t = u]$ and also to find the maximizer in closed form
(these two steps are the analogues of (2.2)). Even more serious problems arise
when there is no upper bound on $T$ because the initial step is then infeasible.
Finally, the supremum in (1.2) may not be attained.

A warning should be issued that the obvious idea of discretizing and truncating
requires continuity properties that (maybe as a surprise!) need not always
hold even in non-artificial settings. For example, if $T = \infty$ or $T$ is a stopping
time, replacing the time horizon by $T \wedge n$, using the above approach going back
from $T \wedge n$ and finally letting $n \to \infty$ to get the $T = \infty$ optimal strategy as
limit of the $T \wedge n$ optimal strategies does not always work, see Schmidli [779,
pp. 6-8].


Notes and references For a list of standard texts in stochastic dynamic pro- 
gramming, see Schmidli [779, p.8]. A ruin probability problem treated by stochastic 
dynamic programming is in Schäl [765].


3 The Hamilton-Jacobi-Bellman equation 


We now turn to the continuous time setting. The approach will be similar to
the dynamic programming one, but in continuous time we go back to $t$ from
$t + dt$ rather than from $t + 1$.

We will state two simplifying assumptions which, however, will be sufficient
to cover the ruin probability applications we have in mind. One is $r(x,u,t) =
e^{-\delta t} r(x,u)$, the other that $T$ is an exit time (for example, the time of ruin).
This ensures that the value function $V(t,x)$ as defined by

$$V(t,x) = \sup_{U^t \in \mathscr{U}^t} \mathbb{E}_{t,x}\Big[e^{-\delta(T-t)} r_T(R_T^U) + \int_t^T e^{-\delta(s-t)} r(R_s^U, u_s)\,ds\Big] \tag{3.1}$$

(where $\mathbb{E}_{t,x}$ denotes $\mathbb{E}[\,\cdot \mid R_t = x,\ t < T]$) only depends on $T - t$, not on $t$ itself.
One is then eventually interested in $V(x) = V(0,x)$.

Generators (cf. Chapter II) turn out to be an essential tool. Denote by $\mathscr{A}^u$
the generator of the (time-homogeneous) Markov process according to which
$R_t^U$ evolves when control $u_t = u$ is used. Then:


Theorem 3.1 Under certain assumptions, the value function $V(\cdot)$ is the solution of

$$0 = \sup_u \big[\mathscr{A}^u V(x) - \delta V(x) + r(x,u)\big]. \tag{3.2}$$


Remark 3.2 Equation (3.2) goes under the name Hamilton-Jacobi-Bellman
(HJB) equation. Its derivation typically requires some assumptions that are
difficult to verify directly. For instance, suitable regularity conditions are needed,
in particular that $V(\cdot)$ is in the domain of $\mathscr{A}^u$ for all $u$ (we do not specify the
remaining ones and the reader should not take the proof below for more than
a heuristic justification). The argument suggests that the maximizer $u^*$ (when
it exists) is the optimal control when $R_t = x$. However, to establish that the
solution of (3.2) indeed solves the control problem, more work is needed (in that
sense, the custom to use $V$ in the formulation of the HJB equation is a slight
abuse of notation).

Another complication is that it is not a priori clear whether the HJB equation
has a unique solution. If it has, then one usually needs to prove separately
(in a so-called verification step) that the obtained solution is indeed the value
function of the optimal control problem. This can be done by either justifying
all steps of the derivation of the HJB equation rigorously, or by proving that
the solution of the HJB equation dominates all other possible value functions
(such a procedure often involves martingale arguments and (extensions of) Itô's
formula). The second possibility is usually the more feasible one.

If the solution of the HJB equation is not unique (which may for instance
happen if the initial condition cannot be specified), then the stochastic control
problem can become very difficult. This effect can for instance occur if the value
function is not as regular as the HJB equation would ask for. In that case one
can often still work with either weak solutions or so-called viscosity solutions,
see the references at the end of the chapter.


Proof of Theorem 3.1. Let $u$ be an arbitrary control and assume that $u$ is used
as control in $[t, t+h)$ and the optimal control is used in $[t+h, T)$. Then $V(x)$
has two parts, the contribution from $[t, t+h)$ and the one from $[t+h, T)$. This
gives

$$V(x) \ge r(x,u)h + o(h) + e^{-\delta h}\,\mathbb{E}_{t,x} V(R_{t+h})
= r(x,u)h - \delta h V(x) + \mathscr{A}^u V(x)\,h + V(x) + o(h),$$

which shows that (3.2) holds with $=$ replaced by $\ge$. To see that the sup is
actually 0, choose a control $u$ such that the above scheme gives an expected
reward of at least $V(x) - \epsilon$. The same calculation then gives

$$V(x) - \epsilon \le \mathscr{A}^u V(x) + r(x,u) + V(x) - \delta V(x).$$

Let $\epsilon \downarrow 0$.


Remark 3.3 When the value function is determined, the next step is to identify
the corresponding control strategy that realizes this value function (this is not
always simple and it may even happen that such a strategy does not exist!). In
any case, by definition at least $\epsilon$-optimal strategies always exist, i.e. for each
$\epsilon > 0$ there is a strategy that leads to a value of $V(x) - \epsilon$.


As may become clear from the above remarks, giving a rigorous and system- 
atic treatment of stochastic control theory in insurance is outside the scope of 
this book. In the sequel we shall rather consider a few particular examples to 
get the flavor of the topic. 


Example 3.4 (OPTIMAL INVESTMENT FOR A DIFFUSION) As a first example,
we consider the investment-ruin problem of Browne [206]. The model is given
by two stochastic differential equations

$$dR_t^0 = a_1\,dt + a_2\,dB_t^0, \qquad dM_t = b_1 M_t\,dt + b_2 M_t\,dB_t^1,$$

where $B^0, B^1$ are independent standard Brownian motions and $R_0^0 = x$. Here
$R^0$ describes the evolution of the reserve of the company without investment
and $M$ the price process of a risky asset. Thus, $R^0$ is Brownian motion with
drift and $M$ is geometric Brownian motion. It is now assumed that the company
is free to invest an amount $u_t$ in the risky asset² at any time $t$ so that in the
presence of investment, the reserve evolves according to

$$dR_t = a_1\,dt + a_2\,dB_t^0 + u_t b_1\,dt + u_t b_2\,dB_t^1.$$

The purpose is to minimize the infinite horizon ruin probability or, equivalently,
to maximize the survival probability $\phi(u) = 1 - \psi(u)$. Thus in the general
formulation we may take $T$ as the ruin time (which is a stopping time), $r = 0$,
$\delta = 0$ and $r_T = -1$. Adding the constant 1 to the value function, $V$ indeed


2Here and in the sequel uz can exceed the present surplus level x, i.e. it is possible to use 
additional sources (or borrow money) for the investment. See the Notes for references which 
deal with constraints on uz such as ut < x. 
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corresponds to the survival probability of the controlled process. Consequently 
the HJB equation is simply 0 = sup, #"“V (x), i.e. 


1 

0 = sup |(aı + ub) V(x) + 5 (2 + u?b2) V” (x)|. (3.3) 
u>0 

For the solution, we first note that (since $V$ is increasing and hence $V'(x) > 0$)
a maximizer $u^*$ can only exist when $V'' < 0$. We then simply compute $u^*$ by
differentiating $\mathcal{A}^u V(x)$ w.r.t. $u$ to get $b_1 V'(x) + u^* b_2^2 V''(x) = 0$, i.e.

\[ u^* = -\frac{b_1}{b_2^2}\,\frac{V'(x)}{V''(x)} . \]

Substituting back in the HJB equation gives the ODE

\[ 0 = a_1 V'(x) - \frac{b_1^2}{b_2^2}\,\frac{V'(x)^2}{V''(x)} + \frac{1}{2}\,a_2^2\,V''(x)
     + \frac{1}{2}\,\frac{b_1^2}{b_2^2}\,\frac{V'(x)^2}{V''(x)} . \]

Dividing by $V'(x)$ and collecting terms shows that $z(x) = V'(x)/V''(x)$ must
be the solution of

\[ 0 = a_1 + \frac{a_2^2}{2}\,\frac{1}{z(x)} - \frac{b_1^2}{2 b_2^2}\,z(x) . \]

Multiplying by $z(x)$ gives a quadratic equation, and since we assumed $V'' < 0$,
$z(x)$ must be the negative solution, say $k$. In particular $z(x)$ and hence $u^*$ does
not depend on $x$, and we get our final solution $u^* = -k b_1/b_2^2$.

The (somewhat surprising) conclusion is that, no matter how large the current
capital $x$, it is optimal to always invest the constant amount $-k b_1/b_2^2$ of
money into the risky asset for minimizing the probability of ruin. The resulting
minimal ruin probability can be calculated by substituting $u^*$ into (3.3) and
using the boundary conditions $V(0) = 0$ (with diffusion, starting in zero leads
to immediate ruin) and $V(\infty) = 1$. This results in

\[ \psi^*(x) = 1 - V(x) = e^{-\gamma x} \quad\text{with}\quad
   \gamma = \frac{2(a_1 + b_1 u^*)}{a_2^2 + b_2^2 (u^*)^2}
          = \frac{2(a_1 - b_1^2 k/b_2^2)}{a_2^2 + b_1^2 k^2/b_2^2} . \]



The example shows a common feature of control problems in diffusion mod- 
els, that the HJB equation often takes a form of a non-standard ODE. Here it 
was rather easily solvable, but in general much ingenuity may be required. It 
should also be stressed that the calculations do not provide a rigorous proof that 
the candidate for the optimal strategy that was found indeed is optimal. For this 
one needs a verification step, that for diffusion models is often done by checking 
that the solution to the HJB equation that was found is twice differentiable (as 
here). Optimality then follows by a simple application of Itô’s formula. 
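
As a numerical companion to Example 3.4, the following Python sketch evaluates the negative root $k$ of the quadratic equation, the constant amount $u^* = -k b_1/b_2^2$ and the exponent $\gamma$. The function name and the parameter values are illustrative assumptions and not part of the text; $b_1 > 0$ is assumed.

```python
import math

def browne_optimal_investment(a1, a2, b1, b2):
    """Return (k, u_star, gamma) for Example 3.4 (assumes b1 > 0, b2 > 0)."""
    c = (b1 / b2) ** 2                         # b1^2 / b2^2
    # Negative root of  -(c/2) z^2 + a1 z + a2^2/2 = 0.
    k = (a1 - math.sqrt(a1 ** 2 + c * a2 ** 2)) / c
    u_star = -k * b1 / b2 ** 2                 # constant optimal amount
    drift = a1 + b1 * u_star                   # drift of the controlled reserve
    variance = a2 ** 2 + (b2 * u_star) ** 2    # its variance constant
    gamma = 2.0 * drift / variance             # psi*(x) = exp(-gamma * x)
    return k, u_star, gamma

# Illustrative parameters (assumed, not from the text).
k, u_star, gamma = browne_optimal_investment(a1=1.0, a2=2.0, b1=0.05, b2=0.2)
gamma_without_investment = 2.0 * 1.0 / 2.0 ** 2   # exponent for u = 0
```

Since $V(x) = 1 - e^{-\gamma x}$ gives $z(x) = -1/\gamma$, the computed values should satisfy $\gamma = -1/k$, which serves as a consistency check of the formulas.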
Example 3.5 (OPTIMAL PROPORTIONAL REINSURANCE FOR A DIFFUSION)
Similar techniques as in Example 3.4 can be used to treat optimal proportional
reinsurance for a diffusion that evolves according to

\[ dR^0_t = a_1\,dt + a_2\,dB^0_t . \]

I.e., at each time $t$ there is the possibility to pass on a fraction $1 - u_t \in [0,1]$
of the risk (which in the diffusion approximation is represented by the second
part above), at the expense of a reduced drift (due to the subtraction of the
reinsurance premium drift $a_0$). Correspondingly, in the presence of proportional
reinsurance, the process follows the dynamics

\[
\begin{aligned}
dR^u_t &= \big(u_t a_1 + (1 - u_t)(a_1 - a_0)\big)\,dt + a_2 u_t\,dB^0_t \\
       &= \big(u_t a_0 + (a_1 - a_0)\big)\,dt + a_2 u_t\,dB^0_t .
\end{aligned}
\]


We can restrict to the case $a_0 > a_1$ (otherwise the ruin probability will trivially
be minimized (namely be equal to 0) by $u_t = 0$ for all $t$). We want to maximize
the probability of survival $\phi(u)$, so as in Example 3.4 we choose $T$ as the ruin
time, $r = 0$, $\delta = 0$ and $r_T = -1$ (and add the constant 1 to the value function)
to arrive at the HJB equation $0 = \sup_{u \in [0,1]} \mathcal{A}^u V(x)$, i.e.

\[ 0 = \sup_{u \in [0,1]} \Big[ (a_1 - a_0 + u a_0)\,V'(x) + \frac{u^2 a_2^2}{2}\,V''(x) \Big]. \]


It can be solved in much the same way as in Example 3.4 (one just has to
additionally take care of the bound $u \in [0,1]$) and one obtains that the optimal
strategy is to have a constant fraction of proportional reinsurance given by
$u^* = \min\{2(1 - a_1/a_0), 1\}$. If this value is now substituted in the HJB equation,
its solution (for the boundary conditions $V(0) = 0$ and $V(\infty) = 1$) gives the
resulting minimal ruin probability

\[ \psi^*(x) = 1 - V(x) = e^{-\kappa x} \]

with

\[ \kappa = \begin{cases}
   a_0^2 / \big(2 a_2^2 (a_0 - a_1)\big) & \text{if } a_1 < a_0 < 2a_1, \\
   2 a_1 / a_2^2 & \text{if } a_0 \ge 2a_1 .
   \end{cases} \]


One finally needs a verification theorem showing that the obtained form of V (x) 
indeed dominates all other admissible strategies which again can be done by ap- 
plying Itô’s formula. 
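
The closed-form answer of Example 3.5 can be checked numerically. The sketch below (function names and parameter values are illustrative assumptions) computes $u^*$ and the exponent $\kappa$, and compares $\kappa$ with the best exponent $2(a_1 - a_0 + u a_0)/(u^2 a_2^2)$ over a grid of constant retention levels $u$, using that a Brownian motion with drift $\mu > 0$ and variance constant $\sigma^2$ has ruin probability $e^{-2\mu x/\sigma^2}$.

```python
def optimal_proportional_reinsurance(a0, a1, a2):
    """Return (u_star, kappa) for Example 3.5; assumes a0 > a1 > 0."""
    u_star = min(2.0 * (1.0 - a1 / a0), 1.0)
    drift = a1 - a0 + u_star * a0
    kappa = 2.0 * drift / (u_star * a2) ** 2
    return u_star, kappa

def exponent_for_constant_retention(u, a0, a1, a2):
    """Exponent 2*drift/variance when the fraction u is retained at all times."""
    return 2.0 * (a1 - a0 + u * a0) / (u * a2) ** 2

# Illustrative parameters (assumed): a1 < a0 < 2*a1, so u* < 1.
a0, a1, a2 = 1.5, 1.0, 2.0
u_star, kappa = optimal_proportional_reinsurance(a0, a1, a2)
grid_best = max(exponent_for_constant_retention(i / 1000.0, a0, a1, a2)
                for i in range(1, 1001))
```

For these values the grid maximum should agree with $\kappa = a_0^2/(2a_2^2(a_0 - a_1))$ up to the grid resolution.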


Stochastic control problems for jump processes turn out to be more subtle:

Example 3.6 (OPTIMAL INVESTMENT FOR THE CRAMÉR-LUNDBERG MODEL)
Let us now consider a Cramér-Lundberg risk reserve process $R^0_t = x + t - A_t$
(where $\{A_t\}$ is a compound Poisson process with rate $\beta$ and individual claim
distribution function $B$) and the possibility to dynamically invest an amount of
$u_t$ into a financial asset that is modeled by geometric Brownian motion $M_t$ with

\[ dM_t = b_1 M_t\,dt + b_2 M_t\,dB_t . \]

The controlled process then satisfies

\[ dR^u_t = (1 + u_t b_1)\,dt + u_t b_2\,dB_t - dA_t . \]


The goal is again to minimize the ruin probability of $\{R^u_t\}$ over all admissible
strategies $u_t$ which are assumed to be predictable (in particular, the value of an
admissible strategy at time $t$ may depend on the history of the process up to $t$,
but not on the size of a claim that may occur at $t$). In the present model, the
HJB equation $0 = \sup_{u \ge 0} \mathcal{A}^u V(x)$ translates into

\[ 0 = \sup_{u \ge 0} \Big[ (1 + b_1 u)\,V'(x) + \frac{u^2 b_2^2}{2}\,V''(x)
       + \beta \Big( \int_0^x V(x - y)\,B(dy) - V(x) \Big) \Big]. \tag{3.4} \]

Note that $u^*(x) \to 0$ as $x \to 0$ (otherwise the investment will lead to
$V(0) = 0$ which cannot be optimal), so we obtain a boundary condition $V'(0) = \beta V(0)$.
The second boundary condition is again $\lim_{x \to \infty} V(x) = 1$.
Since there is no solution to this equation for $V''(x) > 0$, we assume $V''(x) < 0$.
Then the supremum is attained for

\[ u^*(x) = -\frac{b_1\,V'(x)}{b_2^2\,V''(x)} \]

and plugging this into the HJB equation one gets

\[ V'(x) - \frac{b_1^2\,V'(x)^2}{2 b_2^2\,V''(x)}
   + \beta \Big( \int_0^x V(x - y)\,B(dy) - V(x) \Big) = 0 . \tag{3.5} \]


It is now considerably more difficult than in the diffusion case to solve this 
equation and retrieve further information about the optimal strategy. In a first 
step one can show in a verification theorem using Itô’s lemma that if (3.4) has an 
increasing twice continuously differentiable solution, then the feedback strategy 
u* is indeed optimal among all admissible investment strategies. Under further 
assumptions on B (like the existence of a bounded density b(x)), one can then 
show with considerable effort that indeed a unique increasing twice continuously 
differentiable solution of (3.4) exists; however its form can only be determined
numerically. Remarkably, one can still retrieve substantial information about
the asymptotic behavior of both $u^*(x)$ and $\psi_I(x) = 1 - V(x)$ as $x \to \infty$: For
light-tailed claim size distribution $B$, if the adjustment coefficient $\gamma_I$ exists as
the positive solution of

\[ \beta\big(\hat B[r] - 1\big) - r = \frac{b_1^2}{2 b_2^2}, \tag{3.6} \]

then $\psi_I(x) \le e^{-\gamma_I x}$ and (under a mild additional condition) the Cramér-Lundberg
approximation $\lim_{x \to \infty} e^{\gamma_I x}\,\psi_I(x) = C$ holds for some constant $C$. Without
investment, the r.h.s. of (3.6) is zero, so clearly $\gamma_I > \gamma$, and hence optimal
investment can substantially decrease the probability of ruin. Furthermore
$\lim_{x \to \infty} u^*(x) = b_1/(b_2^2 \gamma_I)$, so asymptotically the optimal strategy is to invest
a constant amount into the risky asset, which is somewhat surprising at first
sight.³
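
To illustrate how (3.6) raises the adjustment coefficient, the following sketch solves it by bisection. It assumes exponential claims with rate `delta` (so that $\hat B[r] = \delta/(\delta - r)$ for $r < \delta$), premium rate 1 and $\beta < \delta$; these choices and all names are assumptions made for the example only.

```python
def gamma_with_investment(beta, delta, b1, b2):
    """Positive root of (3.6) for Exp(delta) claims; requires beta < delta."""
    c = b1 ** 2 / (2.0 * b2 ** 2)              # right-hand side of (3.6)

    def lundberg(r):                           # left side minus right side
        return beta * (delta / (delta - r) - 1.0) - r - c

    # lundberg is convex on (0, delta), non-positive at 0 and tends to
    # +infinity at delta, so it has exactly one root in the open interval.
    lo, hi = 0.0, delta * (1.0 - 1e-12)
    for _ in range(200):                       # plain bisection
        mid = 0.5 * (lo + hi)
        if lundberg(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Illustrative parameters (assumed).
beta, delta, b1, b2 = 1.0, 2.0, 0.05, 0.2
gamma_I = gamma_with_investment(beta, delta, b1, b2)
gamma_no_investment = delta - beta             # root of (3.6) with r.h.s. zero
limiting_amount = b1 / (b2 ** 2 * gamma_I)     # limit of u*(x) as x grows
```

In this exponential case (3.6) reduces to the quadratic $r^2 + (\beta - \delta + c)r - c\delta = 0$ with $c = b_1^2/(2b_2^2)$, which gives an independent check of the bisection result.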
On the other hand, for heavy-tailed $B$ (i.e. $\hat B[r] = \infty$ for all $r > 0$), one
can show that $u^*(x)$ is unbounded. If the failure rate of $B$ tends to zero, then
the optimal strategy converges and $\lim_{x \to \infty} u^*(x) = \infty$. A quite pleasant result
(whose proof is beyond the scope of this book) is that for $B, B_0 \in \mathcal{S}$ the optimal
investment strategy leads to

\[ \psi_I(x) \sim \frac{2 \beta b_2^2}{b_1^2} \int_x^\infty
   \frac{1}{\int_0^y \frac{1}{1 - B(z)}\,dz}\,dy, \quad x \to \infty . \tag{3.7} \]


This can be compared with Theorem VIII.2.1 to see the reduction of $\psi(u)$
through investment. If further the failure rate of $B$ tends to zero, then the
rate at which $u^*(x)$ goes to infinity can be identified to be

\[ u^*(x) \sim \frac{b_1}{b_2^2} \int_0^x \frac{1 - B(x)}{1 - B(z)}\,dz, \quad x \to \infty . \]


In particular, if the tail $1 - B(x)$ is regularly varying with index $-\alpha$, then simple
applications of Karamata's theorem translate (3.7) into

\[ \psi_I(x) \sim \frac{2 \beta b_2^2 (\alpha + 1)}{b_1^2\,\alpha}\,\big(1 - B(x)\big), \quad x \to \infty, \]

and

\[ u^*(x) \sim \frac{b_1}{b_2^2 (\alpha + 1)}\,x, \quad x \to \infty . \]

Accordingly, for $x \to \infty$ it is then optimal to invest the constant fraction
$b_1/(b_2^2(\alpha + 1))$ of the surplus into the risky asset.⁴ For a derivation and
detailed discussion of the above results the reader is referred to Schmidli [779,
Ch. IV].

³ This is in contrast to investment of a constant fraction of the reserve, which led to a
Pareto-type tail even for light-tailed $B$, cf. Theorem VIII.6.2.
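
The Karamata step behind the regularly varying case can be sketched as follows. This is an outline under the assumption $\overline{B}(x) = 1 - B(x) = L(x)\,x^{-\alpha}$ with $L$ slowly varying and $\alpha > 0$ (the notation $\overline{B}$ is introduced here for brevity), not a complete proof.

```latex
% Sketch, assuming \overline{B}(x) = 1 - B(x) = L(x) x^{-\alpha}, L slowly varying.
\begin{align*}
\int_0^y \frac{dz}{\overline{B}(z)}
  &= \int_0^y \frac{z^{\alpha}}{L(z)}\,dz
   \sim \frac{y^{\alpha+1}}{(\alpha+1)\,L(y)}
   = \frac{y}{(\alpha+1)\,\overline{B}(y)}, \\
\int_x^\infty \frac{dy}{\int_0^y dz/\overline{B}(z)}
  &\sim (\alpha+1)\int_x^\infty \frac{\overline{B}(y)}{y}\,dy
   = (\alpha+1)\int_x^\infty L(y)\,y^{-\alpha-1}\,dy
   \sim \frac{\alpha+1}{\alpha}\,\overline{B}(x), \\
u^*(x)
  &\sim \frac{b_1}{b_2^2}\,\overline{B}(x)\int_0^x \frac{dz}{\overline{B}(z)}
   \sim \frac{b_1}{b_2^2(\alpha+1)}\,x .
\end{align*}
```

Multiplying the second line by $2\beta b_2^2/b_1^2$ gives the stated asymptotics of $\psi_I(x)$.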


Example 3.7 (OPTIMAL REINSURANCE FOR THE CRAMÉR-LUNDBERG MODEL)
Assume again the Cramér-Lundberg risk reserve process $R^0_t = x + t - \sum_{i=1}^{N_t} U_i$
and the purchase of reinsurance on individual claims according to some reinsurance
form $u_t$ under which the cedent reduces a possible claim payment $U_i$
at time $t$ to $u_t(U_i)$ (with the implicit understanding that $u(y)$ is a continuous
function satisfying $0 \le u(y) \le y$). The goal is to minimize the ruin probability
through dynamically adapting the reinsurance form $u_t$ (to avoid trivialities,
the admissible strategies $u_t$ are again assumed to be predictable, cf. Example
3.6). The premium intensity for such a reinsurance contract is $p_R(u_t)$ for a
continuous function $p_R$ with the understanding that more reinsurance is more
expensive and that full reinsurance is more expensive than first insurance (i.e.
if $u(U_i) = 0$, then $p_R(u) > 1$), as otherwise it would be optimal to reinsure
the entire insurance risk, leading to a ruin probability of zero. The controlled
surplus process is now given by

\[ R^u_t = x + \int_0^t \big(1 - p_R(u_s)\big)\,ds - \sum_{i=1}^{N_t} u_{\sigma_i}(U_i), \]

where $\sigma_i$ denotes the arrival time of the $i$th claim. The HJB equation for this
optimization problem then reads

\[ \sup_{u \in \mathcal{U}} \Big\{ \big(1 - p_R(u)\big)\,V'(x)
   + \beta \Big( \int_0^\infty V(x - u(y))\,B(dy) - V(x) \Big) \Big\} = 0, \tag{3.8} \]

where at this stage $u$ is a function (rather than a constant) representing the
reinsurance form and $\mathcal{U}$ is the (compact) set of all admissible reinsurance forms.
Here $V$ again corresponds to the survival probability of the controlled process
(with $V(z) = 0$ for $z < 0$).
Since we are interested in strictly increasing solutions of (3.8), we can restrict
$\mathcal{U}$ to those admissible strategies for which $p_R(u) < 1$. If one specifies a boundary
value, then (with quite some effort) this equation can be shown to have a
unique, strictly increasing and continuously differentiable solution and that this
solution (after appropriate scaling) indeed minimizes the ruin probability of the
controlled process.


⁴ Theorem VIII.6.2 together with Remark VIII.6.4 show that adopting this constant fraction
strategy for all $x$ also leads to the same asymptotic behavior of $\psi_I(x)$, but for finite $x$ the
performance can be quite different!




More explicit results can be obtained if the set $\mathcal{U}$ is restricted to particular
reinsurance forms. For instance, if one tries to find the optimal dynamic
proportional reinsurance $u(y) = uy$ with $0 \le u \le 1$, then (3.8) simplifies to

\[ \sup_{u \in [0,1]} \Big\{ \big(1 - p_R(u)\big)\,V'(x)
   + \beta \Big( \int_0^{x/u} V(x - uy)\,B(dy) - V(x) \Big) \Big\} = 0 . \tag{3.9} \]

One can then show that if $\liminf_{u \uparrow 1}\,(1 - p_R(u))/(1 - u) > 0$ (i.e. the reinsurer
charges more than the net premium for $u$ close to 1), then it is optimal to
purchase no reinsurance for any initial capital $x$ below some positive level.

Equation (3.9) cannot be solved explicitly, so one has to approximate the
solution numerically in practical examples. Asymptotic results for $x \to \infty$ can
however be obtained. In particular, if a strictly positive solution $\gamma_R$ to the
adjustment equation

\[ \inf_{u \in [0,1]} \big\{ \beta \hat B[ur] - \beta - \big(1 - p_R(u)\big)\,r \big\} = 0 \tag{3.10} \]

exists, then under some mild additional assumptions the Cramér-Lundberg
approximation

\[ \lim_{x \to \infty} e^{\gamma_R x}\,\psi_R(x) = C \tag{3.11} \]

holds for some constant $C > 0$. If moreover the value $u^*$ for which the infimum
in (3.10) is attained is unique, then one can show that $\lim_{x \to \infty} u(x) = u^*$,
i.e. for increasing initial capital $x$ the optimal strategy converges to a constant
reinsurance fraction.
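
For a fixed retention level $u$ the bracket in (3.10) is the Lundberg function of the risk process with claims $uU_i$ and premium rate $1 - p_R(u)$, so $\gamma_R$ is the largest of the corresponding adjustment coefficients. The sketch below evaluates this on a grid. It rests on assumptions that are not in the text: exponential claims with rate `delta` and an expected value premium $p_R(u) = (1 + \theta)(1 - u)\beta/\delta$ with loading `theta` large enough that $p_R(0) > 1$.

```python
def gamma_proportional(beta, delta, theta, grid=10_000):
    """Grid search for (gamma_R, u*) in the adjustment equation (3.10).

    Assumed model: Exp(delta) claims, reinsurance premium
    p_R(u) = (1 + theta) * (1 - u) * beta / delta for retention level u.
    """
    best_r, best_u = 0.0, None
    for i in range(1, grid + 1):
        u = i / grid
        income = 1.0 - (1.0 + theta) * (1.0 - u) * beta / delta   # 1 - p_R(u)
        if income <= beta * u / delta:
            continue                  # no positive drift, hence no positive root
        # Positive root in r of  beta * (delta / (delta - u*r) - 1) - income * r = 0.
        r = delta / u - beta / income
        if r > best_r:
            best_r, best_u = r, u
    return best_r, best_u

# Illustrative parameters (assumed): p_R(0) = 1.25 > 1.
gamma_R, u_opt = gamma_proportional(beta=1.0, delta=2.0, theta=1.5)
```

With these values the maximizing retention level lies strictly inside $(0,1)$ and $\gamma_R$ exceeds the adjustment coefficient $\delta - \beta$ of the uncontrolled process.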
If on the other hand $B$ is regularly varying with index $-\alpha$, then

\[ \psi_R(x) \sim \Big( \inf_{u \in [0,1]}
   \frac{\beta \mu_B u^{\alpha}}{\big(1 - p_R(u) - \beta u \mu_B\big)_+} \Big)\,\overline{B}_0(x) \]

and if the infimum is attained for a unique value $u^*$, then again $\lim_{x \to \infty} u(x) = u^*$.
For subexponential, but lighter tails $B$, available results are not as explicit,
but in that case the optimal strategy can be shown to satisfy
$\limsup_{x \to \infty} u(x) = \inf\{u : 1 - p_R(u) > u \beta \mu_B\}$. I.e. for large $x$ one tries
to reinsure as much as still possible without resulting negative drift. The intuitive
reason is that (in contrast to regularly varying tails) proportional reinsurance
makes the tail of the distributions smaller and so one tries to purchase as much
reinsurance as possible.

As another example one can consider the restricted class of excess-of-loss
strategies $u(y) = \min(y, u)$, so in this case $u$ is the retention. This restriction
leads to the HJB equation

\[ \sup_{u \in [0,\infty]} \Big\{ \big(1 - p_R(u)\big)\,V'(x)
   + \beta \int_0^{a(x,u)} V(x - \min(y,u))\,B(dy) - \beta V(x) \Big\} = 0, \]

where $a(x,u)$ is $x$ if $u > x$ and infinity otherwise. Under mild additional
assumptions one can then show that if the adjustment coefficient $\gamma_R$, this time
defined as the positive solution of

\[ \inf_{u \in [0,\infty]} \Big\{ \beta r \int_0^u \big(1 - B(z)\big)\,e^{rz}\,dz
   - \big(1 - p_R(u)\big)\,r \Big\} = 0, \]

exists, then the Cramér-Lundberg approximation (3.11) again holds and if this
infimum is attained for a unique value $u^*$, then $\lim_{x \to \infty} u(x) = u^*$. Note that $\gamma_R$
now also exists for heavy-tailed claim size distributions and both the Cramér-Lundberg
approximation and the limiting strategy result still apply.


Notes and references Schmidli [779] is a recent and rich source where one finds 
rigorous treatments of the above examples and numerous further stochastic control 
problems in insurance. A short survey of the topic is in Hipp [466]. Browne [206] 
also extends Example 3.4 in several further directions, including dependent Brownian 
motions, minimizing expected penalty at ruin and maximizing exponential utility in 
finite time. Optimal investment problems for the Cramér-Lundberg model were first 
studied by Hipp & Plum [469] and Gaier, Grandits & Schachermayer [386] and in a 
more general framework in [470]; since then many further results have been added. 
Among them, Gaier & Grandits [385] and Grandits [434] extend the optimal invest- 
ment problem for regularly varying claims to the case when in addition a riskless asset 
is available, see also Liu & Yang [601] and Yang & Zhang [902]. For a periodic risk 
model, Kötter & Bäuerle [558] investigate the control problem to maximize the
adjustment coefficient through investment.

The classical references for optimal reinsurance programs are Højgaard & Taksar
[480], Schmidli [775] and Hipp & Vogt [472]. In a similar fashion, investment and 
reinsurance can be controlled simultaneously, usually without substantial additional 
complexity, see e.g. Schmidli [777] and Luo & Taksar [617] for minimizing the proba- 
bility of absolute ruin. 

Whereas in most considered investment problems it is allowed to borrow money 
for the purchase of the risky asset if necessary, Promislow & Young [717] and Azcue 
& Muler [115] deal with the effects of borrowing constraints, in which case one has to 
work with weak solutions to the HJB equation. Luo [616] and Bai & Guo [123] deal 
with several available risky assets. 

The continuous-time risk model leads to elegant and often very explicit solutions 
for the control problems. However, similar to the dynamic hedging problem in finance, 
it will be practically impossible to continuously adjust the investment/reinsurance 
fraction. It is still a challenge for future research to incorporate frictions such as trans- 
action costs and/or limited possibilities for portfolio adjustment into the model; for 
a step in this direction, see Højgaard & Taksar [481]. Optimal control strategies of
reinsurance and investment to minimize the ruin probability in a multivariate
discrete-time risk model can be found in Bäuerle & Blatter [141].

Other types of control problems with the objective to minimize the ruin proba- 
bility include the possibility to accumulate new business (see Hipp & Taksar [471]) 
and to choose between proportional insurance and the issuing of catastrophe bonds 
which are correlated with the insurer’s losses (see Bäuerle [149]). The related topic
of maintaining solvency for pension plans is considered in Olivieri & Pitacco [673]. 
Optimal investment and reinsurance when instead of minimizing the ruin probability 
the objective is to maximize the utility of terminal wealth is investigated in Irgens & 
Paulsen [495]. See also Korn & Wiese [553] for the case where the size of the resulting 
ruin probability is a constraint, Zhang & Siu [913] for a game-theoretic approach that 
involves model uncertainty and Xia & Zhang [874] where a martingale approach is
employed to identify mean-variance efficient investment strategies. For a general model
set-up, see Liu & Ma [603]. Another methodological bridge between problems of the 
above kind and more finance-oriented control problems can be found in Bayraktar & 
Young [145, 146, 147], who consider an individual who consumes at a certain (possibly 
surplus-dependent or random) rate and can invest in a risk-less and a risky asset
in such a way that the probability of ruin before a random time horizon is minimized; 
for the objective to maximize the expected utility of consumption and the size of the 
ruin probability being a constraint, see [148]. 

Another classical stochastic control problem in insurance (originally raised by de 
Finetti [283]) is how to pay out dividends from the risk reserves to shareholders in such 
a way that the expected (utility of the) discounted sum of dividend payments until 
ruin is maximized. The resulting value function is a profitability type measure for the 
value of an insurance portfolio and as such may be interpreted as an alternative to 
the ruin probability which rather measures the safety. There also have been studies of 
optimal strategies that balance between profitability and safety, expressed through a 
penalty term on early ruin, see e.g. Thonhauser & Albrecher [843]. The corresponding 
control problems lead to intricate mathematical challenges and have developed into 
an active field of research, which cannot be covered in this book (see e.g. Albrecher & 
Thonhauser [40] for a recent survey and Schmidli [779] for a detailed treatment. Cf. 
also the Notes of VIII.1). 

In addition to the analytic approach outlined in this chapter, sometimes a more 
probabilistic approach also works in which the control problem is solved within a re- 
stricted smaller class of admissible strategies and then by comparison one can show 
that the optimal strategy is also optimal within the whole class of admissible strategies 
(see e.g. Loeffen [604]). 

In finance applications, a popular and workable alternative to the dynamic pro- 
gramming principle and the HJB equation is the so-called dual method. For insurance 
applications, however, it seems that due to the intervention of the control into the 
underlying surplus process, the resulting set of possible trajectories is too restricted to
make the dual method work here. Concerning general methods, some standard
textbooks on continuous time stochastic control are Davis [278], Fleming & Soner [364],
Øksendal & Sulem [672], Pham [698] and, for numerics, Kushner & Dupuis [541].


Chapter XV


Simulation methodology 


1 Generalities 


This section gives a summary of some basic issues in simulation and Monte Carlo 
methods. We shall be brief concerning general aspects and refer to standard 
textbooks like Asmussen & Glynn [79], Bratley, Fox & Schrage [197], Ripley [740] 
or Rubinstein & Kroese [751] for more detail (a treatment with a special view 
towards insurance is Korn, Korn & Kroisandt [552]); topics of direct relevance 
for the study of ruin probabilities are treated in more depth. 


1a The crude Monte Carlo method


Let $Z$ be some random variable and assume that we want to evaluate $z = \mathbb{E}Z$
in a situation where $z$ is not available analytically but $Z$ can be simulated. The
crude Monte Carlo (CMC) method then amounts to simulating i.i.d. replicates
$Z_1, \ldots, Z_N$, estimating $z$ by the empirical mean $\bar z = (Z_1 + \cdots + Z_N)/N$ and the
variance of $Z$ by the empirical variance

\[ s^2 = \frac{1}{N-1} \sum_{i=1}^N (Z_i - \bar z)^2
       = \frac{1}{N-1} \Big( \sum_{i=1}^N Z_i^2 - N \bar z^2 \Big). \tag{1.1} \]

According to standard central limit theory, $\sqrt{N}\,(\bar z - z) \xrightarrow{\mathcal{D}} N(0, \sigma_Z^2)$, where
$\sigma_Z^2 = \mathrm{Var}(Z)$. Hence

\[ \bar z \pm \frac{1.96\,s}{\sqrt{N}} \tag{1.2} \]

is an asymptotic 95% confidence interval, and this is the form in which the result
of the simulation experiment is commonly reported.

In the setting of ruin probabilities, it is straightforward to use the CMC
method to simulate the finite horizon ruin probability $z = \psi(u, T)$: just simulate
the risk process $\{R_t\}$ up to time $T$ (or $T \wedge \tau(u)$) and let $Z$ be the indicator that
ruin has occurred,

\[ Z = I\Big( \inf_{0 \le t \le T} R_t < 0 \Big) = I\big(\tau(u) \le T\big). \]

The situation is more intricate for the infinite horizon ruin probability $\psi(u)$. The
difficulty in the naive choice $Z = I(\tau(u) < \infty)$ is that $Z$ cannot be simulated in
finite time: no finite segment of $\{S_t\}$ can tell whether ruin will ultimately occur
or not. Sections 2-5 deal with alternative representations of $\psi(u)$ allowing to
overcome this difficulty.
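
A minimal sketch of the CMC method for $\psi(u, T)$, together with the confidence interval (1.2), might look as follows. It assumes the compound Poisson model with premium rate 1, Poisson rate `beta` and exponential claims with rate `delta`; the function names, the seed and the parameter values are illustrative assumptions.

```python
import math
import random

def ruin_indicator(u, horizon, beta, delta, rng):
    """One replicate of Z = I(tau(u) <= T) for R_t = u + t - A_t.

    Ruin can only occur at claim epochs, so only those are inspected.
    """
    t, reserve = 0.0, u
    while True:
        w = rng.expovariate(beta)               # time until the next claim
        t += w
        if t > horizon:
            return 0                            # no ruin before the horizon T
        reserve += w - rng.expovariate(delta)   # premium income minus claim size
        if reserve < 0.0:
            return 1

def cmc_ruin_probability(u, horizon, beta, delta, n, seed=1):
    """Return (z_bar, half_width) of the asymptotic 95% interval (1.2)."""
    rng = random.Random(seed)
    zs = [ruin_indicator(u, horizon, beta, delta, rng) for _ in range(n)]
    z_bar = sum(zs) / n
    s2 = (sum(z * z for z in zs) - n * z_bar ** 2) / (n - 1)   # formula (1.1)
    return z_bar, 1.96 * math.sqrt(s2 / n)
```

A call such as `cmc_ruin_probability(1.0, 20.0, 1.0, 2.0, 20000)` returns the point estimate and the half-width of the interval; the estimate should stay below the infinite horizon value $\psi(u)$ up to simulation error.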


1b Variance reduction techniques 


The purpose of the techniques we study is to reduce the variance of a CMC
estimator $Z$ of $z$, typically by modifying $Z$ to an alternative estimator $Z'$ with
$\mathbb{E}Z' = \mathbb{E}Z = z$ and (hopefully) $\mathrm{Var}(Z') < \mathrm{Var}(Z)$. This is a classical area of the
simulation literature, and many sophisticated ideas have been developed. Typically
variance reduction involves some theoretical idea (in some cases also
a mathematical calculation), an added programming effort, and a longer CPU
time to produce one replication. Therefore, one can argue that unless $\mathrm{Var}(Z')$
is considerably smaller than $\mathrm{Var}(Z)$, variance reduction is hardly worthwhile.
Consider for instance $\mathrm{Var}(Z') = \mathrm{Var}(Z)/2$. Then replacing the number of
replications $N$ by $2N$ will give the same precision for the CMC method as when
simulating $N' = N$ replications of $Z'$, and in most cases this modest increase of
$N$ is totally unproblematic.

We survey two methods which will be used below to study ruin probabil- 
ities, conditional Monte Carlo and importance sampling. However, there are 
others which are widely used in other areas and potentially useful also for ruin 
probabilities. We mention in particular (regression adjusted) control variates, 
stratification and common random numbers. 


Conditional Monte Carlo 


Let $Z$ be a CMC estimator and $Y$ some other r.v. generated at the same time as
$Z$. Letting $Z' = \mathbb{E}[Z \mid Y]$, we then have $\mathbb{E}Z' = \mathbb{E}Z = z$, so that $Z'$ is a candidate
for a Monte Carlo estimator of $z$. Further, writing

\[ \mathrm{Var}(Z) = \mathrm{Var}(\mathbb{E}[Z \mid Y]) + \mathbb{E}(\mathrm{Var}[Z \mid Y])
                  = \mathrm{Var}(Z') + \mathbb{E}(\mathrm{Var}[Z \mid Y]) \]

and ignoring the last term shows that $\mathrm{Var}(Z') \le \mathrm{Var}(Z)$ so that conditional
Monte Carlo always leads to variance reduction.
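
As a toy illustration (not taken from the text), let $X$, $Y$ be i.i.d. standard exponential and $Z = I(X + Y > u)$. Conditioning on $Y$ gives $Z' = \mathbb{E}[Z \mid Y] = e^{-(u - Y)}$ for $Y < u$ and $Z' = 1$ otherwise. The sketch compares the empirical variances of the two estimators; the function name and seed are assumptions.

```python
import math
import random

def compare_estimators(u, n, seed=7):
    """Crude vs. conditional Monte Carlo for z = P(X + Y > u), X, Y i.i.d. Exp(1).

    Returns ((mean, variance) of Z, (mean, variance) of Z' = E[Z | Y]).
    """
    rng = random.Random(seed)
    crude, cond = [], []
    for _ in range(n):
        x, y = rng.expovariate(1.0), rng.expovariate(1.0)
        crude.append(1.0 if x + y > u else 0.0)
        # E[Z | Y] = P(X > u - Y | Y) = exp(-(u - Y)) if Y < u, else 1.
        cond.append(math.exp(-(u - y)) if y < u else 1.0)

    def mean_var(values):
        m = sum(values) / len(values)
        return m, sum((v - m) ** 2 for v in values) / (len(values) - 1)

    return mean_var(crude), mean_var(cond)
```

Both estimators target the same value $(1 + u)e^{-u}$, while the conditional version should show the smaller empirical variance.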
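Importance sampling


The idea is to compute $z = \mathbb{E}Z$ by simulating from a probability measure $\tilde{\mathbb{P}}$
different from the given probability measure $\mathbb{P}$ and having the property that
there exists a r.v. $L$ such that

\[ z = \mathbb{E}Z = \tilde{\mathbb{E}}[LZ]. \tag{1.3} \]

Thus, using the CMC method one generates $(Z_1, L_1), \ldots, (Z_N, L_N)$ from $\tilde{\mathbb{P}}$ and
uses the estimator

\[ \bar z_{IS} = \frac{1}{N} \sum_{i=1}^N L_i Z_i \]

and the confidence interval

\[ \bar z_{IS} \pm \frac{1.96\,s_{IS}}{\sqrt{N}} \quad\text{where}\quad
   s_{IS}^2 = \frac{1}{N-1} \sum_{i=1}^N (L_i Z_i - \bar z_{IS})^2
            = \frac{1}{N-1} \Big( \sum_{i=1}^N L_i^2 Z_i^2 - N \bar z_{IS}^2 \Big). \]

In order to achieve (1.3), the obvious possibility is to take $\mathbb{P}$ and $\tilde{\mathbb{P}}$ mutually
equivalent and $L = d\mathbb{P}/d\tilde{\mathbb{P}}$ as the likelihood ratio.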
Variance reduction may or may not be obtained: it depends on the choice of
the alternative measure $\tilde{\mathbb{P}}$, and the problem is to make an efficient choice.
To this end, a crucial observation is that there is an optimal choice of $\tilde{\mathbb{P}}$:
define $\tilde{\mathbb{P}}$ by $d\tilde{\mathbb{P}}/d\mathbb{P} = Z/\mathbb{E}Z = Z/z$, i.e. $L = z/Z$ (the event $\{Z = 0\}$ is not a
concern because $\tilde{\mathbb{P}}(Z = 0) = 0$). Then

\[ \widetilde{\mathrm{Var}}(LZ) = \tilde{\mathbb{E}}(LZ)^2 - \big[\tilde{\mathbb{E}}(LZ)\big]^2
   = \tilde{\mathbb{E}}\Big[\frac{z^2}{Z^2}\,Z^2\Big] - \Big[\tilde{\mathbb{E}}\,\frac{z}{Z}\,Z\Big]^2
   = z^2 - z^2 = 0 . \]

Thus, it appears that we have produced an estimator with variance zero. However,
the argument cheats because we are simulating since $z$ is not available
analytically. Thus we cannot compute $L = z/Z$ (further, it may often be impossible
to describe $\tilde{\mathbb{P}}$ in such a way that it is straightforward to simulate from
$\tilde{\mathbb{P}}$).

Nevertheless, even if the optimal change of measure is not practical, it gives
a guidance: choose $\tilde{\mathbb{P}}$ such that $d\tilde{\mathbb{P}}/d\mathbb{P}$ is as proportional to $Z$ as possible. This
may also be difficult to assess, but tentatively, one would try to choose $\tilde{\mathbb{P}}$ to
make large values of $Z$ more likely.
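
A standard textbook illustration of this guidance (not from the present section) is the estimation of $z = \mathbb{P}(X > c)$ for a standard normal $X$: sampling from $N(c, 1)$ makes the event $\{X > c\}$ likely, and the likelihood ratio is $L = \exp(-cX + c^2/2)$. The function name, seed and sample size below are assumptions.

```python
import math
import random

def normal_tail_is(c, n, seed=11):
    """Importance sampling estimate of z = P(X > c), X ~ N(0,1), drawing from N(c,1).

    Returns (z_IS, half_width) of the asymptotic 95% confidence interval.
    """
    rng = random.Random(seed)
    total = total_sq = 0.0
    for _ in range(n):
        x = rng.gauss(c, 1.0)                   # draw under the alternative measure
        # L * Z with L = dP/dP~ = exp(-c*x + c^2/2) and Z = I(x > c).
        lz = math.exp(-c * x + 0.5 * c * c) if x > c else 0.0
        total += lz
        total_sq += lz * lz
    z_is = total / n
    s2 = (total_sq - n * z_is ** 2) / (n - 1)
    return z_is, 1.96 * math.sqrt(s2 / n)
```

For $c = 4$ the exact value $\tfrac12\,\mathrm{erfc}(c/\sqrt 2)$ is of order $10^{-5}$, yet a few thousand replications should already give a small relative error, whereas a crude estimator would typically see no hits at all.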


1c Rare events simulation


The problem is to estimate $z = \mathbb{P}(A)$ when $z$ is small, say of the order $10^{-3}$
or less. I.e., $Z = I(A)$ and $A$ is a rare event. In ruin probability theory,
$A = \{\tau(u) \le T\}$ or $A = \{\tau(u) < \infty\}$ and the rare events assumption amounts to $u$
being large, as is the case of typical interest.

The CMC method leads to a variance of $\sigma_Z^2 = z(1 - z)$ which tends to zero
as $z \downarrow 0$. However, the issue is not so much that the precision is good as that
relative precision is bad:

\[ \frac{\sigma_Z}{z} = \frac{\sqrt{z(1 - z)}}{z} \sim \frac{1}{\sqrt{z}} \to \infty . \]

In other words, a confidence interval of width $10^{-4}$ may look small, but if the
point estimate $\bar z$ is of the order $10^{-5}$, it does not help telling whether $z$ is of
the magnitude $10^{-4}$, $10^{-5}$ or even much smaller. Another way to illustrate the
problem is in terms of the sample size $N$ needed to acquire a given relative
precision, say 10%, in terms of the half-width of the confidence interval. This
leads to the equation $1.96\,\sigma_Z/(z\sqrt{N}) = 0.1$, i.e.

\[ N = \frac{100 \cdot 1.96^2\,z(1 - z)}{z^2} \sim \frac{100 \cdot 1.96^2}{z} \]

increases like $z^{-1}$ as $z \downarrow 0$. Thus, if $z$ is small, large sample sizes are required.
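
The sample size formula is easily tabulated; the helper below (its name and default arguments are assumptions for the example) solves $1.96\,\sigma_Z/(z\sqrt{N}) = \epsilon$ for $N$ with $\sigma_Z^2 = z(1 - z)$.

```python
def cmc_sample_size(z, rel_precision=0.1, quantile=1.96):
    """Replications N solving quantile * sigma_Z / (z * sqrt(N)) = rel_precision."""
    return quantile ** 2 * z * (1.0 - z) / (rel_precision * z) ** 2

# Required sample sizes for a few magnitudes of z (10% relative precision).
sizes = {z: cmc_sample_size(z) for z in (1e-2, 1e-4, 1e-6)}
```

Each reduction of $z$ by a factor 100 multiplies the required number of replications by roughly 100.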

We shall focus on importance sampling as a potential (though not the only)
way to overcome this problem. The optimal change of measure (as discussed
above) is given by

\[ \tilde{\mathbb{P}}(B) = \mathbb{E}\Big[\frac{Z}{z};\,B\Big] = \frac{1}{z}\,\mathbb{P}(AB) = \mathbb{P}(B \mid A). \]

I.e., the optimal $\tilde{\mathbb{P}}$ is the conditional distribution given $A$. However, just the
same problem as for importance sampling in general comes up: we do not know
$z$ which is needed to compute the likelihood ratio and thereby the importance
sampling estimator, and further it is usually not practicable to simulate from
$\mathbb{P}(\cdot \mid A)$. Again, we may try to make $\tilde{\mathbb{P}}$ look as much like $\mathbb{P}(\cdot \mid A)$ as possible. An
example where this works out nicely is given in Section 3.

Two established efficiency criteria in rare events simulation are bounded relative
error and logarithmic efficiency. To introduce these, assume that the
rare event $A = A(u)$ depends on a parameter $u$ (say $A = \{\tau(u) < \infty\}$). For
each $u$, let $z(u) = \mathbb{P}(A(u))$, assume that the $A(u)$ are rare in the sense that
$z(u) \to 0$, $u \to \infty$, and let $Z(u)$ be a Monte Carlo estimator of $z(u)$. We then
say that $\{Z(u)\}$ has bounded relative error if $\mathrm{Var}(Z(u))/z(u)^2$ remains bounded
as $u \to \infty$. According to the above discussion, this means that the sample size
$N = N_\epsilon(u)$ required to obtain a given fixed relative precision (say $\epsilon = 10\%$)
remains bounded. Logarithmic efficiency is defined by the slightly weaker requirement
that one can get as close to the power 2 as desired: $\mathrm{Var}(Z(u))$ should
go to 0 at least as fast as $z(u)^{2-\epsilon}$, i.e.

\[ \limsup_{u \to \infty} \frac{\mathrm{Var}(Z(u))}{z(u)^{2-\epsilon}} < \infty \tag{1.4} \]

for any $\epsilon > 0$. This allows $\mathrm{Var}(Z(u))$ to decrease slightly slower than $z(u)^2$,
so that $N_\epsilon(u)$ may go to infinity. However, the mathematical definition puts
certain restrictions on this growth rate, and in practice, logarithmic efficiency
is almost as good as bounded relative error. The term logarithmic comes from
the equivalent form

\[ \liminf_{u \to \infty} \frac{-\log \mathrm{Var}(Z(u))}{-\log z(u)} \ge 2 \tag{1.5} \]

of (1.4).
Notes and references A survey on rare events simulation is in Asmussen & Glynn
[79, Ch. VI]. See also Juneja & Shahabuddin [512].

For details on random variate generation to implement CMC methods and its re- 
finements we refer to the textbooks mentioned at the beginning of the section. The 
traditional approach is pseudo-random numbers generated by some recursion. In
finance applications, quasi-random numbers ([79, IX.3]) have recently become popular
and often lead to a substantial improvement of precision. However, it is folklore that 
quasi-random numbers perform less well when the time horizon is random (say a stop- 
ping time like the ruin time) rather than fixed. For an illustration, see [79, p. 274]. If 
however, an algorithm can be designed which, instead of the risk process, needs the 
simulation of some other quantities with fixed dimension, quasi-random numbers can 
be competitive, see e.g. Coulibaly & Lefèvre [253]. 


2 Simulation via the Pollaczeck-Khinchine formula


Consider the compound Poisson model, let $X_1, X_2, \ldots$ be i.i.d. with common
density $b_0(x) = \overline{B}(x)/\mu_B$, let $S_n = X_1 + \cdots + X_n$ and let $K$ be independent and
geometric with parameter $\rho$, $\mathbb{P}(K = k) = (1 - \rho)\rho^k$. The Pollaczeck-Khinchine
formula IV.(2.2) may be written as $\psi(u) = \mathbb{P}(M > u)$, where $M = S_K$. Thus
$\psi(u) = z = z(u) = \mathbb{E}Z$, where $Z = I(M > u)$ may be generated as follows:


1. Generate $K$ as geometric, $\mathbb{P}(K = k) = (1 - \rho)\rho^k$.
2. Generate $X_1, \ldots, X_K$ from the density $b_0(x)$. Let $M \leftarrow S_K$.
3. If $M > u$, let $Z \leftarrow 1$. Otherwise, let $Z \leftarrow 0$.

The algorithm gives a solution to the infinite horizon problem, but as a CMC
method, it is not efficient for large $u$. Therefore, it is appealing to combine it
with some variance reduction method.
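
The three steps of the algorithm can be sketched in Python as follows. For the usage function it is assumed that claims are exponential with rate `delta` and the premium rate is 1, in which case $B_0$ is again exponential with rate `delta`, $\rho = \beta/\delta$, and $\psi(u) = \rho e^{-(\delta - \beta)u}$ is available for comparison. All names and parameter values are illustrative assumptions.

```python
import math
import random

def pk_replicate(u, rho, sample_b0, rng):
    """One replicate of Z = I(M > u), following steps 1-3."""
    # Step 1: K geometric with P(K = k) = (1 - rho) * rho**k, k = 0, 1, ...
    k = 0
    while rng.random() < rho:
        k += 1
    # Step 2: M = X_1 + ... + X_K with the X_i drawn from the density b_0.
    m = sum(sample_b0(rng) for _ in range(k))
    # Step 3: the indicator of {M > u}.
    return 1 if m > u else 0

def psi_pk(u, beta, delta, n, seed=3):
    """CMC estimate of psi(u) for Exp(delta) claims (B_0 is then Exp(delta))."""
    rng = random.Random(seed)
    rho = beta / delta
    hits = sum(pk_replicate(u, rho, lambda r: r.expovariate(delta), rng)
               for _ in range(n))
    return hits / n

def psi_exact_exponential(u, beta, delta):
    """Closed form for exponential claims and premium rate 1."""
    return (beta / delta) * math.exp(-(delta - beta) * u)
```

For moderate $u$ the estimate from `psi_pk` should match `psi_exact_exponential` within the simulation error; for large $u$ almost all replicates return 0, which is the inefficiency referred to above.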


2a Light tails: importance sampling 


With light tails, there is a standard way to perform importance sampling for
geometric sums. In the present ruin context (assuming the conditions of the
Cramér-Lundberg approximation), it amounts to the following. As set-up, note
that an easy argument using integration by parts shows that the Lundberg
equation for the adjustment coefficient $\gamma$ can alternatively be written as

\[ 1 = \rho \int_0^\infty e^{\gamma y}\,B_0(dy) . \tag{2.1} \]

Let $B_0^*$ be the distribution defined by $dB_0^*/dB_0(x) = \rho e^{\gamma x}$, and to generate one
replication of the estimator, generate $X_1^*, X_2^*, \ldots$ from $B_0^*$. Let
$S_n^* = X_1^* + \cdots + X_n^*$. Stop the simulation at $\tau^*(u) = \inf\{n : S_n^* > u\}$ and return
the estimator $Z^*(u) = e^{-\gamma S^*_{\tau^*(u)}}$.

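To understand the algorithm, note first that $z(u) = \mathbb{P}(\tau(u) \le K)$, where
$\tau(u) = \inf\{n : S_n > u\}$. Next let $\mathbb{P}^*$ be the probability measure where the $X_i$
are i.i.d. with distribution $B_0^*$ and $K$ remains independent and geometric($\rho$).
Then by the definition of $B_0^*$,

\[ \mathbb{P}(X_1 \in dx) = \frac{1}{\rho}\,\mathbb{E}^*\big[e^{-\gamma X_1^*};\,X_1^* \in dx\big]. \]

By a standard extension to stopping times (see, e.g., [79, pp. 131-132]), this
implies

\[ z(u) = \mathbb{E}^*\Big[\frac{1}{\rho^{\tau^*(u)}}\,e^{-\gamma S^*_{\tau^*(u)}};\,\tau^*(u) \le K\Big]
        = \mathbb{E}^* e^{-\gamma S^*_{\tau^*(u)}}, \]

where we used that $K$ remains geometric and independent of the $X_i$ under $\mathbb{P}^*$,
so that $\mathbb{P}^*(K \ge n) = \rho^n$. I.e., the estimator $Z^*(u)$ is unbiased.
Further

\[ \mathbb{E}^* Z^*(u)^2 = \mathbb{E}^* e^{-2\gamma S^*_{\tau^*(u)}} \le e^{-2\gamma u} = O\big(z(u)^2\big), \]

where the last step used the standard Cramér-Lundberg asymptotics
$z(u) \sim C e^{-\gamma u}$. This shows: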


Theorem 2.1 The estimator Z*(u) has bounded relative error. 
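
A minimal sketch of the estimator $Z^*(u)$, assuming exponential claims with rate `delta` and premium rate 1 (so that $\rho = \beta/\delta$, $\gamma = \delta - \beta$, and the tilted distribution $B_0^*$ with density $\rho e^{\gamma x} b_0(x) = \beta e^{-\beta x}$ is exponential with rate `beta`). The function name, seed and parameter values are assumptions for the example.

```python
import math
import random

def tilted_ruin_estimate(u, beta, delta, n, seed=5):
    """Importance sampling for psi(u) with Exp(delta) claims, premium rate 1.

    Returns (mean, empirical variance) of Z*(u) = exp(-gamma * S*_{tau*(u)}).
    """
    rng = random.Random(seed)
    gamma = delta - beta                   # adjustment coefficient
    total = total_sq = 0.0
    for _ in range(n):
        s = 0.0
        while s <= u:                      # run until tau*(u) = inf{n : S*_n > u}
            s += rng.expovariate(beta)     # X*_i drawn from B_0^* = Exp(beta)
        z = math.exp(-gamma * s)
        total += z
        total_sq += z * z
    mean = total / n
    var = (total_sq - n * mean ** 2) / (n - 1)
    return mean, var
```

In this example $\psi(u) = \rho e^{-\gamma u}$ is known, and the ratio of the empirical variance to $\psi(u)^2$ should stay near a constant as $u$ grows, in line with Theorem 2.1.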
2b Heavy tails: conditional Monte Carlo 


With heavy tails, the first efficient algorithm seems to be that of Asmussen & 
Binswanger [72], which gives a logarithmically efficient estimator when the claim 
size distribution $B$ (and hence $B_0$) has a regularly varying tail. So, assume in 
the following that $\bar B_0(x) \sim L(x)/x^{\alpha}$ with $\alpha > 0$ and $L(x)$ slowly varying. Then 
(cf. Theorem X.2.1) $\psi(u) \sim \rho/(1-\rho)\,\bar B_0(u)$, and the problem is to produce 
an estimator $Z(u)$ with a variance going to zero not slower (in the logarithmic 
sense) than $\bar B_0(u)^2$. 
A first obvious idea when using conditional Monte Carlo is to write 


$$\psi(u) = P(X_1 + \cdots + X_K > u) = E\,P[X_1 + \cdots + X_K > u \mid X_1, \ldots, X_{K-1}] = E\,\bar B_0(u - X_1 - \cdots - X_{K-1}).$$

Thus, we generate only $X_1, \ldots, X_{K-1}$, compute $Y = u - X_1 - \cdots - X_{K-1}$ 
and let $Z^{(1)}(u) = \bar B_0(Y)$ (if $K = 0$, $Z^{(1)}(u)$ is defined as 0). As a conditional 
Monte Carlo estimator, $Z^{(1)}(u)$ has a smaller variance than $Z_1(u)$. However, 
asymptotically it presents no improvement: the variance is of the same order of 
magnitude $\bar B_0(u)$. To see this, just note that 


$$E Z^{(1)}(u)^2 \ge E\big[\bar B_0(u - X_1 - \cdots - X_{K-1})^2;\ X_1 > u,\ K \ge 2\big] = \rho^2 P(X_1 > u) = \rho^2 \bar B_0(u)$$


(here we used that by positivity of the $X_i$, $X_1 + \cdots + X_{K-1} > u$ when $X_1 > u$, 
and that $\bar B_0(y) = 1$, $y < 0$). 

This calculation shows that the reason that this algorithm does not work 
well is that the probability of one single $X_i$ to become large is too big. The idea 
of [72] is to avoid this problem by discarding the largest $X_i$ and considering only 
the remaining ones. For the simulation, we thus generate $K$ and $X_1, \ldots, X_K$, 
form the order statistics 


$$X_{(1)} < X_{(2)} < \cdots < X_{(K)},$$


throw away the largest one $X_{(K)}$, and let 


$$Z^{(2)}(u) = P(S_K > u \mid X_{(1)}, X_{(2)}, \ldots, X_{(K-1)}) = \frac{\bar B_0\big((u - S_{(K-1)}) \vee X_{(K-1)}\big)}{\bar B_0(X_{(K-1)})},$$


where $S_{(K-1)} = X_{(1)} + X_{(2)} + \cdots + X_{(K-1)}$. To check the formula for the 
conditional probability, note first that 

$$P(X_{(n)} > x \mid X_{(1)}, X_{(2)}, \ldots, X_{(n-1)}) = \frac{\bar B_0(X_{(n-1)} \vee x)}{\bar B_0(X_{(n-1)})}.$$


We then get 


$$P(S_n > x \mid X_{(1)}, \ldots, X_{(n-1)}) = P(X_{(n)} + S_{(n-1)} > x \mid X_{(1)}, \ldots, X_{(n-1)})$$
$$= P(X_{(n)} > x - S_{(n-1)} \mid X_{(1)}, \ldots, X_{(n-1)}) = \frac{\bar B_0\big((x - S_{(n-1)}) \vee X_{(n-1)}\big)}{\bar B_0(X_{(n-1)})}.$$


Theorem 2.2 Assume that $\bar B_0(x) = L(x)/x^{\alpha}$ with $L(x)$ slowly varying. Then 
the algorithm given by $\{Z^{(2)}(u)\}$ is logarithmically efficient. 


The proof of Theorem 2.2 is elementary but lengthy. We will omit it, since 
another equally simple conditional Monte Carlo estimator developed later by 
Asmussen & Kroese [89] performs better. The idea there is to partition according 
to which $X_i$ is the largest, i.e., for which $i$ one has $M_n = X_{(n)} = X_i$, and 
condition on the $X_j$ with $j \ne i$. Since clearly by symmetry $P(S_n > u) = 
nP(S_n > u, M_n = X_n)$, this gives the estimator 


$$Z_n^{(3)}(u) = nP(S_n > u, M_n = X_n \mid X_1, \ldots, X_{n-1}) = n \bar B_0\big(M_{n-1} \vee (u - S_{n-1})\big) \tag{2.2}$$


when $N = n$ is deterministic (note that for $M_n = X_n$ we need $X_n > M_{n-1}$ and 
for $S_n > u$, we need $X_n > u - S_{n-1}$), and 


$$Z_N^{(3)}(u) = NP(S_N > u, M_N = X_N \mid N, X_1, \ldots, X_{N-1}) = N \bar B_0\big(M_{N-1} \vee (u - S_{N-1})\big) \tag{2.3}$$


when $N$ is random. 
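
As an illustration of (2.2), the following sketch simulates the estimator for a deterministic number $n$ of summands. It is only a sketch under an assumption that is not part of the text: the summands are taken to have the Pareto tail $(1+x)^{-\alpha}$, sampled by inversion. The function names are hypothetical.

```python
import random


def pareto_tail(x, alpha):
    """Tail (1 + x)^(-alpha) for x > 0, and 1 for x <= 0."""
    return 1.0 if x <= 0.0 else (1.0 + x) ** (-alpha)


def asmussen_kroese(u, n, alpha, n_rep, rng):
    """Average of n * tail(max(M_{n-1}, u - S_{n-1})) over n_rep replications."""
    total = 0.0
    for _ in range(n_rep):
        # X_1, ..., X_{n-1} by inversion; 1 - random() lies in (0, 1]
        xs = [(1.0 - rng.random()) ** (-1.0 / alpha) - 1.0 for _ in range(n - 1)]
        s = sum(xs)                 # S_{n-1}
        m = max(xs) if xs else 0.0  # M_{n-1}
        total += n * pareto_tail(max(m, u - s), alpha)
    return total / n_rep


rng = random.Random(2)
u, alpha = 100.0, 1.5
tail_u = pareto_tail(u, alpha)
est1 = asmussen_kroese(u, 1, alpha, 10, rng)     # n = 1: equals the tail itself
est2 = asmussen_kroese(u, 2, alpha, 20000, rng)  # n = 2: close to 2 * tail_u
print(est1, est2, 2.0 * tail_u)
```

Only $n-1$ variables are simulated per replication; the last one is integrated out analytically, which is the source of the variance reduction.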


Theorem 2.3 The estimator $Z_n^{(3)}(u)$ has bounded relative error in the regularly 
varying case, and is logarithmically efficient in the Weibull case provided 
$\beta < \log(3/2)/\log 2 \approx 0.585$. The same holds for $Z_N^{(3)}(u)$ in the regularly 
varying case provided $L(\cdot)$ satisfies 


$$\limsup_{u\to\infty} \frac{1}{L(u)^2}\, E\big[L(u/N)^2 N^{2\alpha+2}\big] < \infty.$$
Proof. We consider only the regularly varying case and the case of a deterministic 
$N = n$. If $M_{n-1} \le u/n$, then $S_{n-1} \le (n-1)u/n$ and therefore always 
$M_{n-1} \vee (u - S_{n-1}) \ge u/n$. Therefore 


$$\frac{Z_n^{(3)}(u)^2}{\bar B_0(u)^2} \le \frac{n^2 \bar B_0(u/n)^2}{\bar B_0(u)^2} = n^2\, \frac{L(u/n)^2/(u/n)^{2\alpha}}{L(u)^2/u^{2\alpha}} = n^{2\alpha+2}\, \frac{L(u/n)^2}{L(u)^2} \sim n^{2\alpha+2}.$$


Noting that $z(u) \sim n\bar B_0(u)$ by subexponentiality completes the proof. 


Notes and references In the case $Z_N^{(3)}(u)$ of a random $N$, it is suggested in 
[89] that either $N$ be used as a control variate or that $N$ be stratified, and a substantial 
variance reduction was obtained. A theoretical support for the control-variate 
approach was provided by Hartinger & Kortschak [453], who showed that in fact, in 
this setting the relative error goes to 0 as $u \to \infty$. 


2c Heavy tails: importance sampling 


Asmussen, Binswanger & Højgaard [73] suggested an importance distribution 
$\tilde B_0$ that is much heavier than $B_0$. They showed for example that for the 
regularly varying case and the tail of $\tilde B_0$ being of order $1/\log x$, this gives 
bounded relative error. The practical experience with the algorithm is, however, 
discouraging, and a much better importance distribution was suggested by 
Juneja & Shahabuddin [511]. They suggested that the tail of $\tilde B_0$ be changed to 
$c_1 \bar B_0(x)^{\theta}$ on $[x_0, \infty)$ and that the density $c_2 b_0(x)$ be used on $(0, x_0)$, where 
$\theta = \theta(u) \to 0$ and $c_1, c_2$ have to be chosen in a certain way. We will not give the 
details but only present a simplified version of the algorithm in the Pareto case 
bo = (a—1)/(1+2)%, where we again choose bo(x) = (&— 1)/(1 +x) as Pareto 
with & = a(u) = ab (u) — 0 (the regularly varying case is an easy extension). 
Thus the estimator is 


$$Z^{(4)}(u) = I(S_N > u) \prod_{i=1}^{N} \frac{b_0(X_i)}{\tilde b_0(X_i)} \tag{2.4}$$


with the r.v.'s simulated as independent under the measure $\tilde P$ where $N$ is 
geometric($\rho$) and $X_1, \ldots, X_N$ have density $\tilde b_0(x)$. 


Theorem 2.4 The estimator $Z^{(4)}(u)$ is logarithmically efficient in the Pareto 
case provided $\log \tilde\alpha / \log u \to 0$. 
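Proof. Let $Z_n^{(4)}(u)$ denote (2.4) with $N$ replaced by some fixed $n$. Then 


$$\tilde E Z_n^{(4)}(u)^2 = \int\!\cdots\!\int_{x_1+\cdots+x_n > u} \frac{b_0(x_1)^2}{\tilde b_0(x_1)^2} \cdots \frac{b_0(x_n)^2}{\tilde b_0(x_n)^2}\, \tilde b_0(x_1) \cdots \tilde b_0(x_n)\, dx_1 \cdots dx_n$$
$$= c_u^{-n} \int\!\cdots\!\int_{x_1+\cdots+x_n > u} b^{\#}(x_1) \cdots b^{\#}(x_n)\, dx_1 \cdots dx_n = c_u^{-n} P^{\#}(S_n > u),$$


where $c_u^{-1} = \int b_0^2/\tilde b_0$ and $b^{\#} = c_u b_0^2/\tilde b_0$. Now 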


Si PO aE aes ASAE «3 a 
ei =), @—D/ata)® * ~ a@a-a) ~ 2a 25) 


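Bounding $P_{2\alpha-\tilde\alpha}(S_n > u)$ above and below by 


$$P_{2\alpha-\epsilon}(S_n > u) \sim \frac{n}{u^{2\alpha-\epsilon}}, \quad \text{respectively} \quad P_{2\alpha+\epsilon}(S_n > u) \sim \frac{n}{u^{2\alpha+\epsilon}},$$

letting $\epsilon \downarrow 0$ and using $\log \tilde\alpha / \log u \to 0$ gives easily that $Z_n^{(4)}(u)$ is logarithmically 
efficient for $P(S_n > u)$. We omit the details that are needed to deal with a 
geometric $N$. 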


Notes and references Asmussen, Binswanger and Højgaard [73] give a general 
survey of rare events simulation for heavy-tailed distributions. In many aspects the 
findings of [73] are quite negative: the large deviations ideas which are the main 
approach to rare events simulation in the light-tailed case do not seem to work for 
heavy tails. It must be noted that a main restriction of all algorithms considered in 
this section is that they are so intimately tied up with the compound Poisson model 
because the explicit form of the Pollaczeck-Khinchine formula is crucial (say, in the 
renewal or Markov-modulated model $P(\tau_+ < \infty)$ and $G_+$ are not explicit). 

A further interesting and useful idea applicable in the Pollaczeck-Khinchine framework 
with heavy tails was given by Juneja [510]. He noted that 


$$P(S_N > u, M_N > u \mid N) = P(M_N > u \mid N) = 1 - B_0(u)^N$$


is explicit, so that only $P(S_N > u, M_N \le u)$ needs to be simulated. 


3 Static importance sampling via Lundberg conjugation 


We consider again the compound Poisson model and assume the conditions of 
the Cramér-Lundberg approximation so that $z(u) = \psi(u) \sim Ce^{-\gamma u}$, use the 
representation $\psi(u) = E_L e^{-\gamma S_{\tau(u)}} = e^{-\gamma u} E_L e^{-\gamma \xi(u)}$ where $\xi(u) = S_{\tau(u)} - u$ is 
the overshoot (cf. IV.5), and simulate from $P_L$, that is, using $\beta_L$, $B_L$ instead of 
$\beta$, $B$, for the purpose of recording $Z(u) = e^{-\gamma S_{\tau(u)}}$. 

For practical purposes, the continuous-time process $\{S_t\}$ is simulated by 
considering it at the discrete epochs $\{\sigma_k\}$ corresponding to claim arrivals. Thus, 
the algorithm for generating $Z = Z(u)$ is: 


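1. Compute $\gamma > 0$ as solution of the Lundberg equation 
$0 = \kappa(\gamma) = \beta(\hat B[\gamma] - 1) - \gamma$, 
and define $\beta_L$, $B_L$ by $\beta_L = \beta \hat B[\gamma]$, $B_L(dx) = e^{\gamma x} B(dx)/\hat B[\gamma]$. 
2. Let $S \leftarrow 0$. 


3. Generate $T$ as being exponential with parameter $\beta_L$ and $U$ from $B_L$. Let 
$S \leftarrow S + U - T$. 


4. If $S > u$, let $Z \leftarrow e^{-\gamma S}$. Otherwise, return to 3. 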
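
The four steps above can be sketched in code as follows. This is an illustration only, under the assumption (not made in the text) that claims are exponential with rate `delta` and that the premium rate is 1. In that case the Lundberg equation has the explicit root $\gamma$ = `delta - beta`, and one finds $\beta_L$ = `delta` and $B_L$ exponential with rate `beta`. The function name is hypothetical.

```python
import math
import random


def lundberg_conjugate_estimate(u, beta, delta, n_rep, rng):
    """Sample mean of Z(u) = exp(-gamma * S_tau(u)) simulated under P_L."""
    # Step 1: gamma solves beta * (delta / (delta - g) - 1) - g = 0
    gamma = delta - beta
    beta_l = beta * delta / (delta - gamma)  # beta * Bhat[gamma], equals delta
    delta_l = delta - gamma                  # B_L is exponential(delta - gamma)
    total = 0.0
    for _ in range(n_rep):
        s = 0.0                               # Step 2
        while s <= u:                         # Step 4: repeat until S > u
            t = rng.expovariate(beta_l)       # Step 3: interarrival time
            claim = rng.expovariate(delta_l)  # Step 3: claim size from B_L
            s += claim - t
        total += math.exp(-gamma * s)
    return total / n_rep


rng = random.Random(5)
u, beta, delta = 10.0, 1.0, 2.0
estimate = lundberg_conjugate_estimate(u, beta, delta, 20000, rng)
exact = (beta / delta) * math.exp(-(delta - beta) * u)
print(estimate, exact)
```

Under $P_L$ the claim surplus process has positive drift, so the inner loop terminates with probability one.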


There are various intuitive reasons that this should be a good algorithm. It 
resolves the infinite horizon problem since $P_L(\tau(u) < \infty) = 1$. We may expect 
a small variance since we have used our knowledge of the form of $\psi(u)$ to isolate 
what is really unknown, namely $E_L e^{-\gamma \xi(u)}$, and avoid simulating the known 
part $e^{-\gamma u}$. More precisely, the results of V.7 tell that $P(\cdot \mid \tau(u) < \infty)$ and $P_L$ 
(both measures restricted to $\mathcal{F}_{\tau(u)}$) asymptotically coincide on $\{\tau(u) < \infty\}$, so 
that changing the measure to $P_L$ is close to the optimal scheme for importance 
sampling, cf. the discussion at the end of Section 1b. In fact: 


Theorem 3.1 The estimator $Z(u) = e^{-\gamma S_{\tau(u)}}$ (simulated from $P_L$) has bounded 
relative error. 


Proof. Just note that $E_L Z(u)^2 \le e^{-2\gamma u} \sim z(u)^2/C^2$. 


It is tempting to ask whether choosing importance sampling parameters $\tilde\beta$, $\tilde B$ 
different from $\beta_L$, $B_L$ could improve the variance of the estimator. The answer 
is no. In detail, to deal with the infinite horizon problem, one must restrict 
attention to the case $\tilde\beta \mu_{\tilde B} \ge 1$. The estimator is then 


$$Z(u) = \prod_{i=1}^{M(u)} \frac{\beta e^{-\beta T_i}}{\tilde\beta e^{-\tilde\beta T_i}}\, \frac{dB}{d\tilde B}(U_i), \tag{3.1}$$


where $M(u)$ is the number of claims leading to ruin, and we have: 

Theorem 3.2 The estimator (3.1) (simulated with parameters $\tilde\beta$, $\tilde B$) is not logarithmically 
efficient when $(\tilde\beta, \tilde B) \ne (\beta_L, B_L)$. 


The proof is given below as a corollary to Theorem 3.3. 

The algorithm generalizes easily to the renewal model. We formulate this 
in a slightly more general random walk setting.¹ Let $X_1, X_2, \ldots$ be i.i.d. with 
distribution $F$, let $S_n = X_1 + \cdots + X_n$, $M(u) = \inf\{n : S_n > u\}$, and assume 
that $\mu_F < 0$ and that $\hat F[\gamma] = 1$, $\hat F'[\gamma] < \infty$ for some $\gamma > 0$. Let $F_L(dx) = 
e^{\gamma x} F(dx)$. The importance sampling estimator is then $Z(u) = e^{-\gamma S_{M(u)}}$. More 
generally, let $\tilde F$ be an importance sampling distribution equivalent to $F$ and 


$$Z(u) = \prod_{i=1}^{M(u)} \frac{dF}{d\tilde F}(X_i). \tag{3.2}$$


Theorem 3.3 The estimator (3.2) (simulated with distribution $\tilde F$ of the $X_i$) has 
bounded relative error when $\tilde F = F_L$. When $\tilde F \ne F_L$, it is not logarithmically 
efficient. 


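Proof. The first statement is proved exactly as Theorem 3.1. For the second, 
write 


$$W(F \mid \tilde F) = \prod_{i=1}^{M(u)} \frac{dF}{d\tilde F}(X_i).$$


By the chain rule for Radon-Nikodym derivatives, 


$$\tilde E Z(u)^2 = \tilde E\, W^2(F \mid \tilde F) = E_L\big[W^2(F \mid \tilde F)\, W(\tilde F \mid F_L)\big]$$
$$= E_L\big[W^2(F \mid F_L)\, W(F_L \mid \tilde F)\big] = E_L \exp\{K_1 + \cdots + K_{M(u)}\},$$


where 


$$K_i = \log\Big(\frac{dF_L}{d\tilde F}(X_i) \Big(\frac{dF}{dF_L}(X_i)\Big)^2\Big) = -\log \frac{d\tilde F}{dF_L}(X_i) - 2\gamma X_i.$$


Here $E_L K_i = \epsilon' - 2\gamma E_L X_i$, where 


$$\epsilon' = -E_L \log \frac{d\tilde F}{dF_L}(X_i) > 0$$


by the information inequality. Since $K_1, K_2, \ldots$ are i.i.d., Jensen's inequality 
and Wald's identity yield 


$$\tilde E Z(u)^2 \ge \exp\{E_L(K_1 + \cdots + K_{M(u)})\} = \exp\{E_L M(u)\,(\epsilon' - 2\gamma E_L X_i)\}.$$

Since $E_L M(u)/u \to 1/E_L X_i$, it thus follows that for $0 < \epsilon < \epsilon'/E_L X_i$, 


$$\limsup_{u\to\infty} \frac{\tilde E Z(u)^2}{z(u)^2 e^{\epsilon u}} = \limsup_{u\to\infty} \frac{\tilde E Z(u)^2}{C^2 e^{-2\gamma u + \epsilon u}} \ge \limsup_{u\to\infty} \frac{e^{-2\gamma u + \epsilon u}}{C^2 e^{-2\gamma u + \epsilon u}} = \frac{1}{C^2} > 0,$$


which completes the proof. 


¹For the renewal model, $X_i = U_i - T_i$, and the change of measure $F \to F_L$ corresponds to 
$B \to B_L$, $A \to A_L$ as in Chapter VI. 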


Proof of Theorem 3.2. Consider compound Poisson risk processes with intensities 
$\beta'$, $\beta''$, generic interarrival times $T'$, $T''$, claim size distributions $B'$, $B''$ and 
generic claim sizes $U'$, $U''$. Then according to Theorem 3.3, all that needs to 
be shown is that if $U' - T' \stackrel{\mathcal D}{=} U'' - T''$, then $\beta' = \beta''$, $B' = B''$. First by the 
memoryless property of the exponential distribution, $U' - T'$ has a left exponential 
tail with rate $\beta'$ and $U'' - T''$ has a left exponential tail with rate $\beta''$. 
This immediately yields $\beta' = \beta''$. Next, from 


$$P(U' - T' > x) = \int_0^\infty \beta' e^{-\beta' y}\, \bar B'(x + y)\, dy = \beta' e^{\beta' x} \int_x^\infty e^{-\beta' z}\, \bar B'(z)\, dz,$$


$$P(U'' - T'' > x) = \int_0^\infty \beta'' e^{-\beta'' y}\, \bar B''(x + y)\, dy = \beta'' e^{\beta'' x} \int_x^\infty e^{-\beta'' z}\, \bar B''(z)\, dz$$


($x \ge 0$) and $\beta' = \beta''$, $U' - T' \stackrel{\mathcal D}{=} U'' - T''$, we conclude by differentiation that 
$\bar B'(x) = \bar B''(x)$ for all $x \ge 0$, i.e. $B' = B''$. 

Notes and references The importance sampling method was suggested by Siegmund 
[807] for discrete time random walks and further studied by Asmussen [56] in the 
setting of compound Poisson risk models. The optimality result Theorem 3.1 is from 
Lehtonen & Nyrhinen [576], with the present (shorter and more elementary) proof 
taken from Asmussen & Rubinstein [99]. In [56], optimality is discussed in a heavy 
traffic limit $\eta \downarrow 0$ rather than when $u \to \infty$. 

The extension to the Markovian environment model is straightforward and was 
suggested in Asmussen [58]. Further discussion is in Lehtonen & Nyrhinen [577]. 

The queueing literature on related algorithms is extensive, see e.g. the references 
in Asmussen & Rubinstein [99], Heidelberger [455] and Juneja & Shahabuddin [512]. 
4 Static importance sampling for the finite horizon case 


The problem is to produce efficient simulation estimators for $\psi(u, T)$ with $T < 
\infty$. As in V.4, we write $T = yu$. The results of V.4 indicate that we can expect 
a major difference according to whether $y < 1/\kappa'(\gamma)$ or $y > 1/\kappa'(\gamma)$. The easy 
case is $y > 1/\kappa'(\gamma)$ where $\psi(u, yu)$ is close to $\psi(u)$, so that one would expect 
the change of measure $P \to P_L$ to produce close to optimal results. In fact: 


Proposition 4.1 If $y > 1/\kappa'(\gamma)$, then the estimator $Z(u) = e^{-\gamma S_{\tau(u)}} I(\tau(u) \le 
yu)$ (simulated with parameters $\beta_L$, $B_L$) has bounded relative error. 


Proof. The assumption $y > 1/\kappa'(\gamma)$ ensures that $\psi(u, yu)/\psi(u) \to 1$ (Theorem 
V.4.1) so that $z(u) = \psi(u, yu)$ is of order of magnitude $e^{-\gamma u}$. Bounding $E_L Z(u)^2$ 
above by $e^{-2\gamma u}$, the result follows as in the proof of Theorem 3.1. 


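We next consider the case $y < 1/\kappa'(\gamma)$. We recall that $\alpha_y$ is defined as 
the solution of $\kappa'(\alpha) = 1/y$, that $\gamma_y = \alpha_y - y\kappa(\alpha_y)$ determines the order of 
magnitude of $\psi(u, yu)$ in the sense that 


$$\frac{-\log \psi(u, yu)}{u} \to \gamma_y \tag{4.1}$$


(Theorem V.4.9), and that $\gamma_y > \gamma$. Further 


$$\psi(u, yu) = e^{-\alpha_y u}\, E_{\alpha_y}\big[e^{-\alpha_y \xi(u) + \tau(u)\kappa(\alpha_y)};\ \tau(u) \le yu\big]. \tag{4.2}$$


Since the definition of $\alpha_y$ is equivalent to $E_{\alpha_y} \tau(u) \sim yu$, one would expect that 
the change of measure $P \to P_{\alpha_y}$ is in some sense optimal. The corresponding 
estimator is 


$$Z(u) = e^{-\alpha_y S_{\tau(u)} + \tau(u)\kappa(\alpha_y)}\, I(\tau(u) \le yu), \tag{4.3}$$


and we have: 


Theorem 4.2 The estimator (4.3) (simulated with parameters $\beta_{\alpha_y}$, $B_{\alpha_y}$) is 
logarithmically efficient. 


Proof. Since $\gamma_y > \gamma$, we have $\kappa(\alpha_y) > 0$ and get 


$$E_{\alpha_y} Z(u)^2 = E_{\alpha_y}\big[e^{-2\alpha_y S_{\tau(u)} + 2\tau(u)\kappa(\alpha_y)};\ \tau(u) \le yu\big] \le e^{-2\gamma_y u}\, E_{\alpha_y}\big[e^{-2\alpha_y \xi(u)};\ \tau(u) \le yu\big] \le e^{-2\gamma_y u}.$$

Hence by (4.1), 


$$\liminf_{u\to\infty} \frac{-\log \mathrm{Var}(Z(u))}{-\log z(u)} = \liminf_{u\to\infty} \frac{-\log \mathrm{Var}(Z(u))}{\gamma_y u} \ge 2,$$


so that (1.5) follows. 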
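
The finite horizon estimator (4.3) can be sketched in the same way as the infinite horizon one. The code below is only an illustration under assumptions that are not part of the text: exponential claims with rate `delta`, unit premium rate, and hypothetical function names. In this case $\kappa(\alpha) = \beta(\delta/(\delta-\alpha) - 1) - \alpha$, the tilted parameters are $\beta_\alpha = \beta\delta/(\delta-\alpha)$ and $B_\alpha$ exponential with rate $\delta - \alpha$, and $\kappa'(\alpha) = 1/y$ has the explicit solution used for `alpha_y`. Taking $\alpha = \gamma$ and a very long horizon gives back the estimator of Proposition 4.1, which is used as a sanity check.

```python
import math
import random


def kappa(alpha, beta, delta):
    """kappa(alpha) = beta * (Bhat[alpha] - 1) - alpha for exponential(delta) claims."""
    return beta * (delta / (delta - alpha) - 1.0) - alpha


def finite_horizon_estimate(u, horizon, alpha, beta, delta, n_rep, rng):
    """Sample mean of exp(-alpha*S_tau + tau*kappa(alpha)) * I(tau <= horizon)."""
    beta_a = beta * delta / (delta - alpha)  # tilted Poisson rate
    delta_a = delta - alpha                  # tilted claims: exponential(delta - alpha)
    k = kappa(alpha, beta, delta)
    total = 0.0
    for _ in range(n_rep):
        claims, t = 0.0, 0.0
        while True:
            t += rng.expovariate(beta_a)     # next claim epoch
            if t > horizon:                  # no ruin before the horizon: Z = 0
                break
            claims += rng.expovariate(delta_a)
            s = claims - t                   # claim surplus just after the claim
            if s > u:                        # ruin at time t <= horizon
                total += math.exp(-alpha * s + t * k)
                break
    return total / n_rep


rng = random.Random(7)
u, beta, delta = 10.0, 1.0, 2.0
gamma = delta - beta
psi_infinite = (beta / delta) * math.exp(-gamma * u)

# Case alpha = gamma with a very long horizon: close to psi(u)
est_long = finite_horizon_estimate(u, 1e9, gamma, beta, delta, 20000, rng)

# Case y < 1/kappa'(gamma): alpha_y solves kappa'(alpha) = 1/y
y = 0.5
alpha_y = delta - math.sqrt(beta * delta * y / (1.0 + y))
est_short = finite_horizon_estimate(u, y * u, alpha_y, beta, delta, 20000, rng)
print(est_long, psi_infinite, est_short)
```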


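Remark 4.3 Theorem V.4.9 has a stronger conclusion than (4.1), and in fact, 
(4.1) (which is all that is needed here) can be shown more easily. Let $\sigma_y^2 = 
\lim_{u\to\infty} \mathrm{Var}_{\alpha_y}(\tau(u))/u$ so that $(\tau(u) - yu)/(\sigma_y u^{1/2}) \stackrel{\mathcal D}{\to} N(0, 1)$ (see Proposition 
V.4.2). Then 


$$z(u) = E_{\alpha_y} Z(u) \ge E_{\alpha_y}\big[e^{-\alpha_y S_{\tau(u)} + \tau(u)\kappa(\alpha_y)};\ yu - \sigma_y u^{1/2} < \tau(u) \le yu\big]$$


$$= e^{-\alpha_y u + yu\kappa(\alpha_y)}\, E_{\alpha_y}\big[e^{-\alpha_y \xi(u) + (\tau(u) - yu)\kappa(\alpha_y)};\ yu - \sigma_y u^{1/2} < \tau(u) \le yu\big]$$


$$\ge e^{-\gamma_y u - \sigma_y u^{1/2}\kappa(\alpha_y)}\, E_{\alpha_y}\big[e^{-\alpha_y \xi(u)};\ yu - \sigma_y u^{1/2} < \tau(u) \le yu\big]$$


$$\sim e^{-\gamma_y u - \sigma_y u^{1/2}\kappa(\alpha_y)}\, E_{\alpha_y} e^{-\alpha_y \xi(\infty)}\, \big(\Phi(1) - 1/2\big),$$


where the last step follows by Stam's lemma (Proposition V.4.4). Hence 


$$\liminf_{u\to\infty} \frac{\log z(u)}{u} \ge \liminf_{u\to\infty} \frac{-\gamma_y u - \sigma_y u^{1/2}\kappa(\alpha_y)}{u} = -\gamma_y.$$


That $\limsup \le$ follows similarly (but more easily) as when estimating $E_{\alpha_y} Z(u)^2$ 
above. 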


Notes and references The algorithms in the present section are the obvious ones, 
but seem to have been discussed for the first time in the first edition of this book. See 
also Nyrhinen [667]. In Asmussen [56], related discussion is given in a heavy traffic 
limit $\eta \downarrow 0$ rather than when $u \to \infty$. 


5 Dynamic importance sampling 


The terms dynamic importance sampling or adaptive importance sampling are 
used in at least two different meanings. One meaning is algorithms that, during 
the execution, change the importance distribution or seek a good one; a good 
example is the cross-entropy algorithm, Rubinstein & Kroese [751]. The sense in 
which we will understand these terms is in describing algorithms that are level- 
and time-dependent: the importance distribution for (say) the Poisson rate 
and the claim size distribution at time $t$ in a compound Poisson claim surplus 
process $\{S_t\}$ depends on the current value $S_t$ as well as on $t$. Algorithms of 
this type have received considerable attention in recent years in areas such as 
queueing theory and have managed to provide efficient algorithms in situations 
where traditional (static) importance sampling got into difficulties. The basic 
idea in most of the papers in the area is to implement the principle of looking 
for a description of the conditional distribution given the rare event. We will 
exemplify this in two settings. Most steps in the variance calculations leading 
to asymptotic efficiency results are omitted since they are always very lengthy 
and technical in the dynamical setting. 


5a An algorithm by Dupuis, Leder and Wang 


We follow Dupuis, Leder & Wang [336]. The setting is again that of estimating 
$P(S_n > u)$ where $S_n = X_1 + \cdots + X_n$ with $X_1, \ldots, X_n$ non-negative and i.i.d. 
with common subexponential distribution $F$ with density $f$ ($F$ can in particular 
be $B_0$ as discussed earlier). 

In dynamic importance sampling, the importance distribution $\tilde P$ will generate 
$X_k$ from a density $\tilde f_{u,k,x}$ depending both on $u$, $k$ and $S_{k-1} = x$. Thus, the 
estimator is 


$$Z(u) = I(S_n > u) \prod_{k=1}^{n} \frac{f(X_k)}{\tilde f_{u,k,S_{k-1}}(X_k)}. \tag{5.1}$$


If $x > u$, obviously no importance sampling is needed. If $x \le u$, $x$ will 
typically be much smaller than $u$. Basically, the event $S_n > u$ then occurs by 
one of the $X_\ell$, $\ell = k, \ldots, n$, exceeding $u - x$, and the probability that $k = \ell$ is 
$1/(n - k + 1)$; otherwise $X_k$ is 'typical'. This suggests taking 


$$\tilde f_{u,k,x}(y) = \frac{n-k}{n-k+1}\, f(y) + \frac{1}{n-k+1}\, \frac{f(y)}{\bar F(u - x)}\, I(y > u - x) \tag{5.2}$$


(note that $I(y > u - x) f(y)/\bar F(u - x)$ is the conditional density of $X_k$ given 
$X_k > u - x$). 

Unfortunately, this idea is too naive to produce efficient estimators, see 
Remark 5.2 below. One needs to replace the conditioning $X_k > u - x$ by 
$X_k > a(u - x)$ for some $a < 1$. As a generalization, [336] also allows weights 
different from the ones in (5.2). Thus, instead of (5.2) one has 


$$\tilde f_{u,k,x}(y) = p_k f(y) + q_k\, \frac{f(y)}{\bar F(a(u - x))}\, I(y > a(u - x)) \tag{5.3}$$


where $p_k + q_k = 1$. 
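
A sketch of how an estimator of the form (5.1) with mixture densities of the form (5.3) might be simulated is given below. It is an illustration only and rests on assumptions that are not in the text: $F$ is taken to be Pareto with tail $(1+x)^{-\alpha}$, the weights are the ones from (5.2), $p_k = (n-k)/(n-k+1)$, so that in particular $p_n = 0$, and the same $a$ is used at every step. The function names are hypothetical.

```python
import random


def pareto_tail(x, alpha):
    """Tail (1 + x)^(-alpha) for x > 0, and 1 for x <= 0."""
    return 1.0 if x <= 0.0 else (1.0 + x) ** (-alpha)


def dlw_estimate(u, n, alpha, a, n_rep, rng):
    """State-dependent importance sampling estimate of P(S_n > u)."""
    total = 0.0
    for _ in range(n_rep):
        s, lr = 0.0, 1.0
        for k in range(1, n + 1):
            if s > u:
                # level already exceeded: sample from f, likelihood ratio 1
                s += (1.0 - rng.random()) ** (-1.0 / alpha) - 1.0
                continue
            p = (n - k) / (n - k + 1.0)   # weight of the 'typical' component
            q = 1.0 - p                   # weight of the 'big jump' component
            c = a * (u - s)               # threshold a * (u - x)
            tail_c = pareto_tail(c, alpha)
            v = 1.0 - rng.random()        # uniform on (0, 1]
            if rng.random() < p:
                x = v ** (-1.0 / alpha) - 1.0              # X_k from f
            else:
                x = (1.0 + c) * v ** (-1.0 / alpha) - 1.0  # X_k from f given X_k > c
            # f(x) / f_tilde(x) for the two-component mixture
            lr /= p + (q / tail_c if x >= c else 0.0)
            s += x
        if s > u:
            total += lr
    return total / n_rep


rng = random.Random(3)
u, alpha, a = 50.0, 2.0, 0.9
tail_u = pareto_tail(u, alpha)
est1 = dlw_estimate(u, 1, alpha, a, 20000, rng)  # n = 1: target is the tail itself
est3 = dlw_estimate(u, 3, alpha, a, 20000, rng)  # n = 3: roughly 3 * tail_u
print(est1, tail_u, est3)
```

For $n = 1$ the target is known exactly, which gives a simple check of unbiasedness; for larger $n$ the output can be compared with the subexponential approximation $n\bar F(u)$.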
Theorem 5.1 Assume that $F$ is regularly varying with index $\alpha$ and that the 
importance distribution is given by (5.3). Then for any fixed $n$, the estimator 
(5.1) has bounded relative error. More precisely, 


$$\frac{\tilde E Z(u)^2}{\bar F(u)^2} \to \prod_{j=1}^{n-1} \frac{1}{p_j} + \frac{1}{a^{\alpha}} \sum_{\ell=1}^{n-1} \frac{1}{q_\ell} \prod_{j=1}^{\ell-1} \frac{1}{p_j}, \quad u \to \infty. \tag{5.4}$$


As said above, the proof is too lengthy to be given here. 


Remark 5.2 Relation (5.4) shows that the closer a is to 1, the more asymp- 
totically efficient is the estimator (5.3). It is therefore tempting to take a = 1. 
However, it turns out that there is a discontinuity at a = 1, and for a = 1, there 
is in fact not even logarithmic efficiency. 

The problem is that the first-order heavy-tailed asymptotics are more imprecise 
than with light tails: realizations with $\max_k X_k < u$ but $S_n > u$ are 
asymptotically unimportant, but cannot be neglected for a finite $u$. This phenomenon 
is somewhat related to the slow rate of convergence of heavy-tailed 
approximations. 


Remark 5.3 An obvious question is to find the minimizers $p_k^*$ of the 
r.h.s. of (5.4). They are in fact not given by $p_k = (n - k)/(n - k + 1)$ but by 


$$p_k^* = \frac{(n-k-1)/a^{\alpha/2} + 1}{(n-k)/a^{\alpha/2} + 1}$$


(of course, these two expressions coincide as $a \uparrow 1$). 


Notes and references Further relevant papers in the same direction are Dupuis 
& Wang [337] and Hult & Svensson [484]. 


5b An algorithm by Blanchet and Glynn 


For a more general discussion of the distribution of a stochastic process given a 
rare event (e.g. ruin), consider a discrete state Markov chain $\{X_n\}$ with transition 
probabilities $p(x, y)$. Let the state space be $E$, let $G \subset E$ and $\tau_G = 
\inf\{n : X_n \in G\}$, $h(x) = P_x(\tau_G < \infty)$. Then for any initial value $x_0 \notin G$, the 
conditional distribution of $\{X_n\}$ given $\tau_G < \infty$ is a Markov chain with transition 
probabilities 


$$p^h(x, y) = p(x, y)\, \frac{h(y)}{h(x)}. \tag{5.5}$$


See [79, VI.7]; the transition function in (5.5) is referred to as an h-transform. 
In a ruin context, one is interested in evaluating $h(x)$ or several of the $h(x)$. 
Simulating using the $p^h(x, y)$ is of course not practicable since one is simulating 
precisely because the function $h$ is unknown. However, one may try to plug in 
an approximation and adapt this as an importance sampling scheme. 
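
To make the h-transform concrete, the following small sketch computes it in a case where $h$ is known in closed form. The example is an assumption made for illustration and is not from the text: a simple random walk on $\{0, \ldots, N\}$ with up-probability 1/3, absorbed at 0, and $G = \{N\}$, so that $h(x)$ is the classical gambler's ruin probability of reaching $N$ before 0. Exact rational arithmetic is used so that the row sums of the transformed kernel can be checked exactly.

```python
from fractions import Fraction

N = 5                   # states 0..N, G = {N}, state 0 is absorbing
p_up = Fraction(1, 3)   # assumed up-probability (negative drift)
p_down = 1 - p_up


def h(x):
    """h(x) = P_x(reach N before 0), the gambler's ruin formula."""
    r = p_down / p_up
    return (r ** x - 1) / (r ** N - 1)


def p(x, y):
    """One-step transition probability of the original walk for 0 < x < N."""
    if y == x + 1:
        return p_up
    if y == x - 1:
        return p_down
    return Fraction(0)


def p_h(x, y):
    """h-transform: p(x, y) * h(y) / h(x)."""
    return p(x, y) * h(y) / h(x)


row_sums = [sum(p_h(x, y) for y in range(N + 1)) for x in range(1, N)]
up_probs = [p_h(x, x + 1) for x in range(1, N)]
print(row_sums)
print(up_probs)
```

Each row of the transformed kernel sums to one because $h$ is harmonic for the original walk, and from state 1 the conditioned chain moves up with probability one since $h(0) = 0$.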


Example 5.4 Assume that $X_n = S_n$ is a random walk with negative drift, 
$G = (u, \infty)$ and $x_0 = 0$ (of course, the compound Poisson case or the renewal 
model can be handled in this way by looking at the risk process at claim arrival 
instants only). Then $h(0) = \psi(u)$. In (5.5), write $y = x + z$ and assume for 
simplicity that the increment distribution has a density $f$. With light tails, we 
then have the Cramér-Lundberg approximation $h(v) \sim Ce^{-\gamma(u - v)}$ for $v < u$, 
so that (5.5) suggests that the transition density from $x$ to $x + z < u$ be taken 
roughly as 


$$f^h(x + z \mid x) = f(z)\, \frac{Ce^{-\gamma(u - x - z)}}{Ce^{-\gamma(u - x)}} = f(z) e^{\gamma z}.$$


That is, we are back to the Siegmund algorithm discussed in Section 3 and that 
was shown there to give bounded relative error. 

Obviously, this is a promising start for implementing the h-transform ideas. 
However, light tails are the easy case! We will see below that much more 
care is needed for heavy-tailed increments. Here the suggestion from the standard 
subexponential approximations in Chapter X is that the transition density 
from $x$ to $x + z < u$ be taken roughly as 


$$f^h(x + z \mid x) = D(x)\, f(z)\, \frac{\bar F_I(u - x - z)}{\bar F_I(u - x)}$$

where $D(x)$ is a normalizing constant. However, there are at least two difficulties 
in this choice. First, $f^h(x + z \mid x)$ is not a standard density even in simple cases 
as the Pareto where 


$$f^h(x + z \mid x) = D(x)\, \frac{\alpha (1 + u - x)^{\alpha - 1}}{(1 + z)^{\alpha + 1} (1 + u - x - z)^{\alpha - 1}}$$


so that it is not straightforward to generate r.v.'s from $f^h(x + z \mid x)$. Further, 
$f^h(x + z \mid x)$ depends on $x$, which makes it far more difficult to bound the 
variance than in the Siegmund case. 


We will not discuss the r.v. generation issue here, but to resolve the difficulties 
in bounding the variance, we return to the general Markov chain case. 
Assume that the importance distribution is a Markov chain with transition 
probabilities of the form $\tilde p(x, y) = p(x, y)/r(x, y)$, where for each $x$ one would 
typically try to choose $r(x, y)$ as $c(x)/a(y)$ where $a(y)$ is roughly asymptotically 
proportional to $h(y)$ and $c(x)$ ensures the normalization $\sum_{y \in E} \tilde p(x, y) = 1$. The 
estimator for $z = h(x_0)$ is then 


$$Z = I(\tau_G < \infty) \prod_{n=1}^{\tau_G} r(X_{n-1}, X_n), \tag{5.6}$$


with the $X_n$ simulated as a Markov chain with $X_0 = x_0$ and transition probabilities 
$\tilde p(x, y)$. Had one used instead the $p^h(x, y)$, one could be sure that the 
simulation would terminate, i.e. $P^h(\tau_G < \infty) = 1$. Given the way the $r(x, y)$ 
have been chosen, one could hope that also $\tilde P(\tau_G < \infty) = 1$, but this is a separate 
problem that we will ignore in the following. 

As noted above, a crucial but not easy point is to estimate and bound the 
variance of $Z$, or equivalently the second moment vector $m_2$ with elements 
$m_2(x) = \tilde E_x Z^2$, $x \notin G$. In the rest of this section, we follow Blanchet & 
Glynn [175]. The main idea of [175] is to use a Lyapounov function technique, 
cf. part (iii) of the following result. Define $K$ as the $G^c \times G^c$ matrix with 
elements $k(x, y) = r(x, y)p(x, y)$, and let $\eta$ be the column vector with elements 
$\eta(x) = \tilde E_x[r(x, X_1)^2;\ X_1 \in G] = \sum_{y \in G} k(x, y)$. Note that $m_2$ and $\eta$ have dimension $G^c$. 


Theorem 5.5 (i) The vector $m_2$ is the minimal solution to $m_2 = \eta + Km_2$. 
(ii) $m_2 = \sum_{m=0}^{\infty} K^m \eta$. 
(iii) Let $k$ be a $G^c$-vector such that $Kk \le k - \eta$. Then $m_2 \le k$. 


Given the potential of Theorem 5.5 and the following Corollary 5.6 for the (in 
general very difficult) problem of bounding the variance of the estimator (5.6), 
we give the 


Proof. We have 


$$\tilde E_{x_0}[Z^2;\ \tau_G = m] = \sum_{x_1, \ldots, x_{m-1} \notin G,\ x_m \in G}\ \prod_{n=1}^{m} r(x_{n-1}, x_n)^2\, \tilde p(x_{n-1}, x_n) = \sum_{x_1, \ldots, x_{m-1} \notin G,\ x_m \in G}\ \prod_{n=1}^{m} k(x_{n-1}, x_n).$$


This is the $x_0$th element of the vector $K^{m-1}\eta$. Summing over $m$, (ii) follows. (i) 
is then an easy consequence. 

For (iii), we have $\eta \le k - Kk$ and hence $K^m \eta \le K^m k - K^{m+1} k$ for all $m$. 
Thus 


$$\sum_{m=0}^{n} K^m \eta \le k - K^{n+1} k \le k.$$


Letting $n \to \infty$ and using (ii) gives (iii). 


Intuitively, one should choose $a(x)$ as a good approximation to $h(x)$ and then 
take $k(x)$ of the form $a(x)^2 \tilde k(x)$ with $\tilde k(x) = O(1)$. This idea is made precise 
in the following corollary and its proof: 
Corollary 5.6 Assume $r(x, y)$ has the form $c(x)/a(y)$ and that 


$$c(x) \sum_{y \in E} p(x, y)\, a(y)\, \tilde k(y) \le \tilde k(x)\, a(x)^2 \tag{5.7}$$


for some $E$-vector $\tilde k$ and all $x \notin G$. If $\tilde k(x) \ge 1$ for all $x \in E$ and $a(y) \ge \kappa > 0$ 
for all $y \in G$, then $m_2(x) \le \kappa^{-2} a(x)^2 \tilde k(x)$ for all $x \notin G$. 


Proof. We first note that 


$$\eta(x) = \sum_{y \in G} k(x, y) = c(x) \sum_{y \in G} \frac{p(x, y)}{a(y)} \le \kappa^{-2} c(x) \sum_{y \in G} p(x, y)\, a(y)\, \tilde k(y). \tag{5.8}$$


Define $k(x) = \kappa^{-2} a(x)^2 \tilde k(x)$. Then for $x, y \notin G$, we have 


$$\kappa^{-2} c(x)\, p(x, y)\, a(y)\, \tilde k(y) = \kappa^{-2} k(x, y)\, a(y)^2\, \tilde k(y) = k(x, y)\, k(y).$$


Thus combining with (5.8), it follows from (5.7) divided by $\kappa^2$ that $k \ge Kk + \eta$. 
Now appeal to Theorem 5.5(iii). 


In the rest of this section, we assume that $X_n = -u + Y_1 + \cdots + Y_n$ is a 
random walk with negative drift, subexponential increments and (for simplicity) 
density $f(z)$, and take $G = (0, \infty)$. The ruin probability is then $\psi(u) = h(-u)$ 
and $\tau_G$ is the ruin time. The start of implementing the h-transform ideas is 
easy: as for light tails, we have an approximation for $h(v)$ for $v < 0$, now 
$h(v) \sim a(v) = C\bar F_I(-v)$ where $F_I$ is the integrated tail distribution of the 
increment distribution $F$, cf. X.3.1. In the representation $r(x, y) = c(x)/a(y)$, 
it will be convenient to be able to think of $a(y)$ as the tail of a r.v. $Z$, that we 
take as the r.v. with 


$$P(Z > z) = \min\big[1,\ C\bar F_I(z)\big].$$


Thus, we take $a(y) = P(Z > -y)$ and have 


$$c(x) = \int_{\mathbb R} f(y - x)\, a(y)\, dy = E a(x + Y) = P(Y + Z > -x).$$

The most obvious procedure is now to use the estimator 

$$Z(u) = I(\tau_G < \infty) \prod_{n=1}^{\tau_G} \frac{c(X_{n-1})}{a(X_n)}. \tag{5.9}$$

However, as for the Dupuis-Leder-Wang algorithm one encounters the difficulty 
that the most obvious choice does not have the desired efficiency properties but 
needs modification, more precisely to 


Z = Zu) = Ire <0) |] Peers (5.10) 


Here $x^* = x^*(y)$ is taken as in the following lemma (recall the definition of the 
class $\mathcal S^*$ from p. 302): 


Lemma 5.7 Assume $Y^+ \in \mathcal S^*$. Then: 


(i) $c(x) - a(x) = o(\bar F(-x))$ as $x \downarrow -\infty$; 
(ii) given $y \in (0, 1]$, there exists $x^*(y) < 0$ such that 


$$\frac{a(x)^2 - c(x)^2}{\bar F(-x)\, c(x)} \ge -y \quad \text{for all } x \le x^*(y). \tag{5.11}$$


Theorem 5.8 Assume $Y^+ \in \mathcal S^*$, let $0 < y < 1$, let $x^*$ be defined as in (5.11) 
and $d(x^*) = P(Z > -x^*)$. Then 


uZ (u)? 1 
z S Te 


The proof of Theorem 5.8 and Lemma 5.7 (which is a crucial step in bounding 
the variance) are long and technical, although in principle elementary, and will 
not be reproduced here. We note once more that random variate generation in 
(5.11) is not a standard problem. 


Notes and references A further relevant reference in the setting of the Blanchet- 
Glynn algorithm is Blanchet, Glynn & Liu [176]. 

To summarize our discussion of dynamic importance sampling, the method is not 
straightforward to implement in the heavy-tailed case. The most obvious ideas need 
modification and tuning to produce efficient algorithms, and these steps may require 
tedious calculations, cf. Lemma 5.7. Further, bounding the variance is not straight- 
forward at all and random variate generation may present problems. 

However, the Blanchet-Glynn algorithm is remarkable by being the first to be 
efficient for an infinite horizon problem with heavy tails when no alternative represen- 
tation (say the Pollaczeck-Khinchine geometric sum in the compound Poisson model) 
is available. In fact, it is shown in Bassamboo, Juneja & Zeevi [140] that no static im- 
portance sampling algorithm exists for efficient simulation of the tail of the maximum 
of a random walk (at least in the regularly varying case). 
6 Regenerative simulation 


Our starting point is the duality representations in III.3: for many risk processes 
$\{R_t\}$, there exists a dual process $\{V_t\}$ such that 


$$\psi(u, T) = P\Big(\inf_{0 \le t \le T} R_t < 0\Big) = P(V_T > u), \qquad \psi(u) = P\Big(\inf_{t \ge 0} R_t < 0\Big) = P(V_\infty > u), \tag{6.1}$$


where the identity for $\psi(u)$ requires that $V_t$ has a limit in distribution $V_\infty$. 

In most of the simulation literature (say in queueing applications), the object 
of interest is $\{V_t\}$ rather than $\{R_t\}$, and (6.1) is used to study $V_\infty$ by simulating 
$\{R_t\}$ (for example, the algorithm in Section 3 produces simulation estimates for 
the tail $P(W > u)$ of the GI/G/1 waiting time $W$). However, we believe that 
there are examples also in risk theory where (6.1) may be useful. One main 
example is $\{V_t\}$ being regenerative (see A.1): then by Proposition A1.3, 


$$\psi(u) = P(V_\infty > u) = \frac{1}{E\omega}\, E \int_0^{\omega} I(V_t > u)\, dt \tag{6.2}$$


where $\omega$ is the generic cycle for $\{V_t\}$. The method of regenerative simulation, 
which we survey below, provides estimates for $P(V_\infty > u)$ (and more general 
expectations $Eg(V_\infty)$). Thus the method provides one answer on how to avoid 
simulating $\{R_t\}$ for an infinitely long time period. 

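For details, consider first the case of independent cycles. Simulate a zero-delayed 
version of $\{V_t\}$ until a large number $N$ of cycles have been completed. 
For the $i$th cycle, record $Z^{(i)} = (Z_1^{(i)}, Z_2^{(i)})$ where $Z_1^{(i)} = \omega_i$ is the cycle length, 
$Z_2^{(i)}$ the time during the cycle where $\{V_t\}$ exceeds $u$ and $z_j = EZ_j^{(i)}$, $j = 1, 2$. 
Then $Z^{(1)}, \ldots, Z^{(N)}$ are i.i.d. and 


$$EZ_1^{(i)} = z_1 = E\omega, \qquad EZ_2^{(i)} = z_2 = E \int_0^{\omega} I(V_t > u)\, dt.$$


Thus, letting 


$$\bar Z_1 = \frac{1}{N}\big(Z_1^{(1)} + \cdots + Z_1^{(N)}\big), \qquad \bar Z_2 = \frac{1}{N}\big(Z_2^{(1)} + \cdots + Z_2^{(N)}\big), \qquad \hat\psi(u) = \frac{\bar Z_2}{\bar Z_1} = \frac{Z_2^{(1)} + \cdots + Z_2^{(N)}}{Z_1^{(1)} + \cdots + Z_1^{(N)}},$$

the LLN yields $\bar Z_1 \stackrel{a.s.}{\to} z_1$, $\bar Z_2 \stackrel{a.s.}{\to} z_2$, 


$$\hat\psi(u) \stackrel{a.s.}{\to} \frac{z_2}{z_1} = \frac{E \int_0^{\omega} I(V_t > u)\, dt}{E\omega} = \psi(u)$$


as $N \to \infty$. Thus, the regenerative estimator $\hat\psi(u)$ is consistent. 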
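To derive confidence intervals, let $\Sigma$ denote the $2 \times 2$ covariance matrix of 
$Z^{(i)}$. Then 


$$\sqrt{N}\, \big(\bar Z_1 - z_1,\ \bar Z_2 - z_2\big) \stackrel{\mathcal D}{\to} N_2(0, \Sigma).$$

Therefore, a standard transformation technique (sometimes called the delta 
method, cf. [79, IV.4]) yields 


$$\sqrt{N}\, \big(h(\bar Z_1, \bar Z_2) - h(z_1, z_2)\big) \stackrel{\mathcal D}{\to} N(0, \sigma_h^2)$$

for $h : \mathbb R^2 \to \mathbb R$ and $\sigma_h^2 = \nabla_h \Sigma \nabla_h^{T}$, $\nabla_h = (\partial h/\partial z_1\ \ \partial h/\partial z_2)$. Taking $h(z_1, z_2) = 
z_2/z_1$ yields $\nabla_h = (-z_2/z_1^2\ \ 1/z_1)$, 


$$\sqrt{N}\, \big(\hat\psi(u) - \psi(u)\big) \stackrel{\mathcal D}{\to} N(0, \sigma^2) \tag{6.3}$$

where 

$$\sigma^2 = \frac{z_2^2}{z_1^4}\, \Sigma_{11} + \frac{1}{z_1^2}\, \Sigma_{22} - 2\, \frac{z_2}{z_1^3}\, \Sigma_{12}. \tag{6.4}$$

The natural estimator for $\Sigma$ is the empirical covariance matrix 

$$S = \frac{1}{N-1} \sum_{i=1}^{N} \big(Z^{(i)} - \bar Z\big)\big(Z^{(i)} - \bar Z\big)^{T},$$

so $\sigma^2$ can be estimated by 

$$s^2 = \frac{\bar Z_2^2}{\bar Z_1^4}\, S_{11} + \frac{1}{\bar Z_1^2}\, S_{22} - 2\, \frac{\bar Z_2}{\bar Z_1^3}\, S_{12} \tag{6.5}$$


and the 95% confidence interval is $\hat\psi(u) \pm 1.96\, s/\sqrt{N}$. 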
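
The point estimate and the confidence interval can be computed from recorded cycle data as in the following sketch. The function name and the toy cycle data are hypothetical; each cycle is represented by a pair (cycle length, time spent above $u$ during the cycle), and the simulation of the cycles themselves is not shown.

```python
import math


def regenerative_estimate(cycles):
    """Ratio estimate and 95% half-width from a list of (Z1, Z2) pairs."""
    n = len(cycles)
    z1_bar = sum(z1 for z1, _ in cycles) / n
    z2_bar = sum(z2 for _, z2 in cycles) / n
    # entries of the empirical covariance matrix S
    s11 = sum((z1 - z1_bar) ** 2 for z1, _ in cycles) / (n - 1)
    s22 = sum((z2 - z2_bar) ** 2 for _, z2 in cycles) / (n - 1)
    s12 = sum((z1 - z1_bar) * (z2 - z2_bar) for z1, z2 in cycles) / (n - 1)
    psi_hat = z2_bar / z1_bar
    # delta-method variance estimate for the ratio
    s2 = (z2_bar ** 2 / z1_bar ** 4) * s11 + s22 / z1_bar ** 2 \
        - 2.0 * (z2_bar / z1_bar ** 3) * s12
    half_width = 1.96 * math.sqrt(max(s2, 0.0) / n)  # guard against rounding below 0
    return psi_hat, half_width


cycles = [(2.0, 0.5), (4.0, 1.0), (3.0, 0.0), (1.0, 0.5)]
psi_hat, half_width = regenerative_estimate(cycles)

# If Z2 is exactly proportional to Z1, the ratio has no variability
psi_prop, half_width_prop = regenerative_estimate([(2.0, 1.0), (4.0, 2.0), (6.0, 3.0)])
print(psi_hat, half_width, psi_prop, half_width_prop)
```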

The regenerative method is not likely to be efficient for large u but rather a 
brute force one. However, in some situations it may be the only one resolving 
the infinite horizon problem, say risk processes with a complicated structure of 
the point process of claim arrivals and heavy-tailed claims. There is potential 
also for combining it with some variance reduction method. 


Notes and references The literature on regenerative simulation is extensive, and 
we will not attempt a literature survey here. 
7 Sensitivity analysis 


We return to the problem of IV.9, to evaluate the sensitivity $\psi_\zeta(u) = (d/d\zeta)\psi(u)$ 
where $\zeta$ is some parameter governing the risk process. In IV.9, asymptotic 
estimates were derived using the renewal equation for $\psi(u)$. We here consider 
simulation algorithms which have the potential of applying to substantially more 
complex situations. 

Before going into the complications of ruin probabilities, consider an extremely 
simple example, the expectation $z = EZ$ of a single r.v. $Z$ of the form 
$Z = \varphi(X)$ where $X$ is a r.v. with distribution depending on a parameter $\zeta$. Here 
are the ideas of the two main approaches in today's simulation literature: 


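The score function (SF) method. Let $X$ have a density $f(x, \zeta)$ depending 
on $\zeta$. Then $z(\zeta) = \int \varphi(x) f(x, \zeta)\, dx$ so that differentiation yields 


$$z_\zeta = \frac{d}{d\zeta} \int \varphi(x) f(x, \zeta)\, dx = \int \varphi(x)\, \frac{\partial}{\partial\zeta} f(x, \zeta)\, dx = \int \varphi(x)\, \frac{(\partial/\partial\zeta) f(x, \zeta)}{f(x, \zeta)}\, f(x, \zeta)\, dx = E[SZ],$$


where 
$$S = \frac{(\partial/\partial\zeta) f(X, \zeta)}{f(X, \zeta)} = \frac{\partial}{\partial\zeta} \log f(X, \zeta)$$
is the score function familiar from statistics. Thus, $SZ$ is an unbiased 
Monte Carlo estimator of $z_\zeta$. 


Infinitesimal perturbation analysis (IPA) uses sample path derivatives. So 
assume that a r.v. with density $f(x, \zeta)$ can be generated as $h(U, \zeta)$ where 
$U$ is uniform(0,1). Then $z(\zeta) = E\varphi(h(U, \zeta))$, 


$$z_\zeta = E\Big[\frac{d}{d\zeta}\, \varphi(h(U, \zeta))\Big] = E\big[\varphi'(h(U, \zeta))\, h_\zeta(U, \zeta)\big],$$


where $h_\zeta(u, \zeta) = (\partial/\partial\zeta) h(u, \zeta)$. Thus, $\varphi'(h(U, \zeta)) h_\zeta(U, \zeta)$ is an unbiased 
Monte Carlo estimator of $z_\zeta$. For example, if $f(x, \zeta) = \zeta e^{-\zeta x}$, one can 
take $h(U, \zeta) = -\log U/\zeta$, giving $h_\zeta(U, \zeta) = \log U/\zeta^2$. 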
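
Both estimators can be tried out on the exponential example just mentioned. The sketch below makes the additional assumption, chosen for illustration only, that $\varphi(x) = x$, so that $z(\zeta) = 1/\zeta$ and the exact sensitivity is $z_\zeta = -1/\zeta^2$. The function name is hypothetical.

```python
import math
import random


def sf_and_ipa(zeta, n_rep, rng):
    """Score function and IPA estimates of d/dzeta E[X] for X ~ exponential(zeta)."""
    sf_total = 0.0
    ipa_total = 0.0
    for _ in range(n_rep):
        unif = 1.0 - rng.random()              # uniform on (0, 1], avoids log(0)
        x = -math.log(unif) / zeta             # X = h(U, zeta)
        score = 1.0 / zeta - x                 # d/dzeta log(zeta * exp(-zeta * x))
        sf_total += score * x                  # S * Z with Z = phi(X) = X
        ipa_total += math.log(unif) / zeta**2  # phi'(h) * h_zeta with phi' = 1
    return sf_total / n_rep, ipa_total / n_rep


rng = random.Random(11)
zeta = 2.0
sf_est, ipa_est = sf_and_ipa(zeta, 200000, rng)
exact = -1.0 / zeta**2
print(sf_est, ipa_est, exact)
```

In this smooth example both methods are unbiased; the IPA estimator has the smaller variance here, but it is not applicable when $\varphi$ is an indicator.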


The derivations of these two estimators are heuristic in that both use an 
interchange of expectation and differentiation that needs to be justified. For 
the SF method, this is usually unproblematic and involves some application of 
dominated convergence. For IPA there are, however, non-pathological examples 
where sample path derivatives fail to produce estimators with the correct expectation. 
To see this, just take $\varphi$ as an indicator function, say $\varphi(x) = I(x > x_0)$ 
and assume that $h(U, \zeta)$ is increasing in $\zeta$. Then, for some $\zeta_0 = \zeta_0(U)$, $\varphi(h(U, \zeta))$ 
is 0 for $\zeta < \zeta_0$ and 1 for $\zeta > \zeta_0$ so that the sample path derivative $\varphi'(h(U, \zeta))$ 
is 0 w.p. one. Thus, IPA will estimate $z_\zeta$ by 0 which is obviously not correct. 
In the setting of ruin probabilities, this phenomenon is particularly unpleasant 
since indicators occur widely in the CMC estimators. A related difficulty occurs 
in situations involving the Poisson number $N_t$ of claims: also here the sample 
path derivative w.r.t. $\beta$ is 0. The following example demonstrates how the SF 
method handles this situation. 


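Example 7.1 Consider the sensitivity $\psi_\beta(u)$ w.r.t. the Poisson rate $\beta$ in the
compound Poisson model. Let $M(u)$ be the number of claims up to the time
$\tau(u)$ of ruin (thus, $\tau(u) = T_1 + \cdots + T_{M(u)}$). The likelihood ratio up to $\tau(u)$ for
two Poisson processes with rates $\beta$, $\beta_0$ is

$$\prod_{i=1}^{M(u)} \frac{\beta e^{-\beta T_i}}{\beta_0 e^{-\beta_0 T_i}}\; I(\tau(u) < \infty).$$

Taking expectation, differentiating w.r.t. $\beta$ and letting $\beta_0 = \beta$, we get

$$\psi_\beta(u) = E\Big[\sum_{i=1}^{M(u)} \Big(\frac{1}{\beta} - T_i\Big) I(\tau(u) < \infty)\Big]
= E\Big[\Big(\frac{M(u)}{\beta} - \tau(u)\Big) I(\tau(u) < \infty)\Big].$$

To resolve the infinite horizon problem, change the measure to $P_L$ as when
simulating $\psi(u)$. We then arrive at the estimator

$$Z_\beta(u) = \Big(\frac{M(u)}{\beta} - \tau(u)\Big) e^{-\gamma u} e^{-\gamma \xi(u)}$$

for $\psi_\beta(u)$ (to generate $Z_\beta(u)$, the risk process should be simulated with
parameters $\beta_L$, $B_L$).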

We recall (Proposition IV.9.4) that $\psi_\beta(u)$ is of the order of magnitude $u e^{-\gamma u}$.
Thus, the estimation of $\psi_\beta(u)$ is subject to the same problem concerning relative
precision as in rare event simulation. However, since

$$E_L Z_\beta(u)^2 \le E_L\Big(\frac{M(u)}{\beta} - \tau(u)\Big)^2 e^{-2\gamma u} = O(u^2)\,e^{-2\gamma u},$$

we have

$$\frac{\mathrm{Var}_L(Z_\beta(u))}{\psi_\beta(u)^2} \sim \frac{O(u^2)\,e^{-2\gamma u}}{u^2 e^{-2\gamma u}} = O(1)$$

so that in fact the estimator $Z_\beta(u)$ has bounded relative error.
Remark 7.2 IPA and score functions are not the only ones around. Here are
some further alternatives:

Finite differences are simply a stochastic version of numerical differentiation.
So, assume $Z$ can be generated as $h(X,\zeta)$ for a suitable random vector $X$. Then
the estimate of $z_\zeta$ is

$$\frac{h(X,\zeta + h/2) - h(X,\zeta - h/2)}{h} \tag{7.1}$$

(there are several possible variants; this one uses common random numbers
and central differences; note that $h$ denotes both the generating function and the
step size). In many situations, this idea is the simplest one to
implement. Its problem is that the estimate is biased. In the limit $h \downarrow 0$, (7.1)
becomes the IPA estimator.

The idea of weak derivatives is measure-valued differentiation. Suppose that
we are interested in the sensitivity of $Eh(Y,X)$ w.r.t. $\zeta$, where $X$ has density
$f(x,\zeta)$ with $f'(x,\zeta) = \partial f(x,\zeta)/\partial\zeta$ and $Y$ is a random vector with distribution
independent of $\zeta$. Since $\int f'(x,\zeta)\,dx = 0$, we will typically be able to write
$f'(x,\zeta)$ as $k f_+(x,\zeta) - k f_-(x,\zeta)$ where $f_+(x,\zeta)$, $f_-(x,\zeta)$ are probability densities
and $k$ a constant. If $W_+$, $W_-$ are r.v.'s with these densities, we therefore have

$$\frac{d}{d\zeta} E h(Y,X) = \frac{d}{d\zeta} E\int h(Y,x) f(x,\zeta)\,dx = E\int h(Y,x) f'(x,\zeta)\,dx
= E\big[k h(Y,W_+) - k h(Y,W_-)\big],$$

so that the desired estimator can be taken as $k h(Y,W_+) - k h(Y,W_-)$. For
example, in the Poisson case $f(x,\zeta) = e^{-\zeta}\zeta^x/x!$ we get

$$f'(x,\zeta) = e^{-\zeta}\zeta^{x-1}/(x-1)! - e^{-\zeta}\zeta^{x}/x! = f(x-1,\zeta) - f(x,\zeta)$$

(with the convention $\zeta^{-1}/(-1)! = 0$) so that $k = 1$ and we can generate
$W_+$, $W_-$ as $V_+ + 1$, $V_-$ with $V_+$, $V_-$ being Poisson($\zeta$).
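
The Poisson weak-derivative estimator can be sketched as follows. This is an illustration only: the test function $h(x) = x^2$ (for which $(d/d\zeta)E h(X) = 1 + 2\zeta$), the helper names, and the coupling $V_+ = V_- = V$ are assumptions of the example; only the marginal distributions of $W_+$ and $W_-$ matter for unbiasedness.

```python
import math
import random


def poisson(rng, zeta):
    """Sample Poisson(zeta) by Knuth's multiplication method (moderate zeta)."""
    limit = math.exp(-zeta)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def weak_derivative_estimate(h, zeta, n, seed=0):
    """Estimate d/dzeta E[h(X)], X ~ Poisson(zeta), by averaging h(V+1) - h(V)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        v = poisson(rng, zeta)        # coupling V_+ = V_- = V (illustrative choice)
        total += h(v + 1) - h(v)      # k = 1, W_+ = V + 1, W_- = V
    return total / n


if __name__ == "__main__":
    print(weak_derivative_estimate(lambda x: x * x, zeta=3.0, n=100_000))
```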

Finally, in finance (where the sensitivities go under the name Greeks) meth- 
ods based on formulas from Malliavin calculus have become popular. 


Notes and references A general survey of simulation methods for evaluating 
sensitivities is given in Asmussen & Glynn [79]. For topics not treated there in detail, 
see e.g. Heidergott [456, 457] for weak derivatives, and Fournie et al. [368] and Kohatsu- 
Riga & Montero [549] for the Malliavin approach. A general reference for IPA is 
Glasserman [417], one for the SF method is Rubinstein & Shapiro [754]. 

Example 7.1 is from Asmussen & Rubinstein [100] who also work out a number of 
similar sensitivity estimators, in part for different measures of risk than ruin probabil- 
ities, for different models and for the sensitivities w.r.t. different parameters. 

There has been much work on resolving the difficulties associated with IPA pointed 
out above. In the setting of ruin probabilities, a relevant reference is Vazquez-Abad 
[862]. 


Chapter XVI 


Miscellaneous topics 


1 More on discrete-time risk models 


There are at least two reasons to consider the discrete-time counterparts of 
continuous-time risk models: one is that the resulting approximation can be 
computationally easier to handle, in particular when more complex features like 
interest, investment, dividends and reinsurance are also included. Secondly one 
could claim that all events (claims, premium payments etc.) are in practice only 
observable and/or payable at discrete points in time and so a discrete modeling 
may be considered closer to reality. However, much of the mathematical ele- 
gance and insight is usually lost when replacing continuous-time dynamics by 
discrete ones. If the claim size distribution is also discrete (which is the case 
we consider here¹), then the differential equations from the continuous set-up
are replaced by difference equations and therefore the probability of ruin can 
be calculated recursively for given numerical values of the model parameters. 
A disadvantage of this approach is that it is usually not possible to track the 
influence of model parameters on the final result and consequently the qualita- 
tive behavior of ruin probabilities. On the other hand, the resulting method for 
calculating ruin probabilities and related quantities is simple and general and, 
as we shall see below, some relations and identities of continuous-time models 
have analogues in the discrete-time set-up. 


¹ Note that at several places in this book we have already dealt with certain discrete-time
models as approximations for continuous-time models, then usually with continuous claim 
size distributions. Here we focus on the fully discrete model to emphasize the computational 
alternative for obtaining ruin probabilities that it may offer. 
Assume that the discrete-time risk reserve process $\{R_n^{(d)}\}$ is given by

$$R_n^{(d)}(u) = u + n - \sum_{i=1}^{n} X_i, \qquad n \in \mathbb{N}, \tag{1.1}$$

where $X_1, X_2, \ldots$ are i.i.d. integer-valued non-negative random variables with
probability function $h_k = P(X_1 = k)$ ($k = 0, 1, 2, \ldots$) and c.d.f. $H$. The
interpretation is that $X_j$ is the total claim amount paid in year (or time unit) $j$. The
initial capital $u$ is also assumed to be a non-negative integer, so that $\{R_n^{(d)}\}$ is
always integer-valued. We impose throughout the net profit condition $EX_1 < 1$.


Remark 1.1 The model (1.1) is often referred to as the compound binomial
model. This is justified because in each time interval the total claim size is 0 with
probability $h_0 > 0$ and hence $\sum_{i=1}^{n} X_i \stackrel{d}{=} \sum_{i=1}^{N_n} Y_i$, where $N_n$ is a binomial$(n, 1 - h_0)$
r.v. and $P(Y_i = k) = h_k/(1 - h_0)$ for $k = 1, 2, \ldots$. The compound Poisson
model then appears as the natural continuous-time limit. In that sense the
discrete-time model can also help to sharpen the intuition for the continuous-time
set-up.


Define as usual the claim surplus process $\{S_n^{(d)}\}$ by

$$S_n^{(d)} = u - R_n^{(d)}(u) = \sum_{i=1}^{n} X_i - n.$$

The ruin time for (1.1) is defined as

$$\tau^{(d)}(u) = \min\{n \ge 1 : R_n^{(d)}(u) \le 0\} = \min\{n \ge 1 : S_n^{(d)} \ge u\}$$

and the ruin probability as

$$\psi^{(d)}(u) = P(\tau^{(d)}(u) < \infty) = P\Big(\max_{n \ge 1} S_n^{(d)} \ge u\Big)$$

(we follow here the tradition of the literature to consider the process ruined
already if it reaches level 0, but only for some $n \ge 1$, so $\psi^{(d)}(0) < 1$).


Proposition 1.2 The ruin probability for the discrete-time risk process (1.1)
satisfies the recursion

$$\psi^{(d)}(u) = \sum_{y=0}^{u-1} (1 - H(y))\,\psi^{(d)}(u - y) + \sum_{y=u}^{\infty} (1 - H(y)), \qquad u = 1, 2, \ldots, \tag{1.2}$$

with starting value $\psi^{(d)}(0) = \sum_{y=0}^{\infty} (1 - H(y)) = E(X_1)$.
This is obviously a discrete analogue of the renewal equation IV.(3.2), and
becomes clear at once if one conditions on the value $y$ of the first (weak) ascending
ladder point $S^{(d)}_{\tau_+}$ of the claim surplus process, where

$$\tau_+ = \tau^{(d)}(0) = \min\{n \ge 1 : S_n^{(d)} \ge 0\},$$

provided it has been shown that

$$g_+^{(d)}(y) = P\big(S^{(d)}_{\tau_+} = y,\ \tau_+ < \infty\big) = 1 - H(y).$$

That $g_+^{(d)}(y) = 1 - H(y)$ can be proved by adapting the proof of Theorem III.5.1
from continuous to discrete time. This is straightforward, which is intuitively
plausible from the fact that the claim surplus processes have the common feature
of being downward skipfree with unit drift when no claims occur. Alternatively,
one may simply refer to the form of $g_+^{(d)}(y)$ as a known result in random walk
theory (e.g. [APQ, Cor. 5.6, p. 236] combined with the connection on [APQ, p. 222]
concerning strong and weak ladder heights). We shall, however, also present
a direct proof that avoids the slightly sophisticated probabilistic ideas of III.5
or [APQ].

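Proof. In the first time unit, the premium income is 1 and the risk reserve
process will only survive if the total claim amount satisfies $X_1 \le u$ and will
then start anew at the level $u + 1 - X_1$ (note that because of the independence
assumption of the total claim amounts $X_1, X_2, \ldots$ the process is Markov). Hence

$$\psi^{(d)}(u) = \sum_{k=0}^{u} h_k\,\psi^{(d)}(u + 1 - k) + 1 - H(u)
= \sum_{j=1}^{u+1} h_{u+1-j}\,\psi^{(d)}(j) + 1 - H(u), \qquad u = 0, 1, 2, \ldots \tag{1.3}$$

From this it follows that for $w = 0, 1, 2, \ldots$

$$\sum_{u=0}^{w} \psi^{(d)}(u) = \sum_{u=0}^{w}\sum_{j=1}^{u+1} h_{u+1-j}\,\psi^{(d)}(j) + \sum_{u=0}^{w} (1 - H(u))$$
$$= \sum_{j=1}^{w+1} \psi^{(d)}(j) \sum_{u=j-1}^{w} h_{u+1-j} + \sum_{u=0}^{w} (1 - H(u))$$
$$= \sum_{j=1}^{w} \psi^{(d)}(j)\,H(w + 1 - j) + \psi^{(d)}(w + 1)\,h_0 + \sum_{u=0}^{w} (1 - H(u))$$

or equivalently

$$\psi^{(d)}(w + 1)\,h_0 = \psi^{(d)}(0) + \sum_{j=1}^{w} \psi^{(d)}(j)\big(1 - H(w + 1 - j)\big) - \sum_{u=0}^{w} (1 - H(u)).$$

At the same time we can read off from (1.3) that

$$\psi^{(d)}(w + 1)\,h_0 = \psi^{(d)}(w) - \sum_{j=1}^{w} h_{w+1-j}\,\psi^{(d)}(j) - (1 - H(w)),$$

and equating the last two equations gives (using $H(w + 1 - j) - h_{w+1-j} = H(w - j)$)

$$\psi^{(d)}(w) = \psi^{(d)}(0) + \sum_{j=1}^{w} \psi^{(d)}(j)\big(1 - H(w - j)\big) - \sum_{u=0}^{w-1} (1 - H(u)). \tag{1.4}$$

On the other hand, with $g_+^{(d)}(y)$ as above, we clearly have

$$\psi^{(d)}(u) = \sum_{y=0}^{u-1} g_+^{(d)}(y)\,\psi^{(d)}(u - y) + \sum_{y=u}^{\infty} g_+^{(d)}(y), \qquad u = 1, 2, \ldots$$

and

$$\psi^{(d)}(0) = \sum_{y=0}^{\infty} g_+^{(d)}(y).$$

We can hence write for $u = 1, 2, \ldots$

$$\psi^{(d)}(u) = \sum_{y=0}^{u-1} g_+^{(d)}(y)\,\psi^{(d)}(u - y) + \psi^{(d)}(0) - \sum_{y=0}^{u-1} g_+^{(d)}(y) \tag{1.5}$$
$$= \sum_{y=1}^{u} g_+^{(d)}(u - y)\,\psi^{(d)}(y) + \psi^{(d)}(0) - \sum_{y=0}^{u-1} g_+^{(d)}(y). \tag{1.6}$$

Comparing (1.4) and (1.6) now establishes

$$g_+^{(d)}(y) = 1 - H(y), \qquad y = 0, 1, 2, \ldots$$

and

$$\psi^{(d)}(0) = \sum_{y=0}^{\infty} (1 - H(y)) = E(X_1).$$

Inserting the latter formula in (1.5) now gives (1.2).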
Remark 1.3 Note the complete analogy of the formulas for $g_+^{(d)}(y)$ and $\psi^{(d)}(0)$
with the ones for the compound Poisson risk model in Chapter IV (this comes
as no surprise, since as outlined in Remark 1.1 the latter may be interpreted as
the continuous-time limit of the compound binomial model).


Remark 1.4 The proof of Proposition 1.2 via random walk theory has, however,
the advantage of naturally giving the form of $g_+^{(d)}(y)$ for more general upward
steps than 1 subject to some rootfinding. More precisely, one can handle the
model

$$R_n^{(d)}(u) = u + \sum_{i=1}^{n} (Y_i - X_i), \qquad n \in \mathbb{N}, \tag{1.7}$$

where the $Y_n$ are i.i.d. with support in $\{0, \ldots, r\}$ for some $r \in \mathbb{N}$. For details,
see [APQ, Sect. VIII.5a].

An appealing special case of (1.7) is $Y_n \equiv r > 1$. This allows the claims
to be orders smaller than the premium inflow, which may appear more realistic
than (1.1).


Example 1.5 Consider the two-point distribution $h_0 = \theta = 1 - h_2$ for $\theta > 1/2$.
Then we are in the situation of the Gambler's ruin problem and the recursion
(1.2) indeed corresponds to the one of Proof 1 of Proposition II.2.1 (with $a = \infty$)
leading to $\psi^{(d)}(u) = ((1 - \theta)/\theta)^u$ as already given in II.(2.3).
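
The recursion (1.2) can be turned into a short program. The following is a sketch under stated assumptions: the claim size probability function has finite support and is passed as a list `h = [h_0, ..., h_K]`, and the function name is hypothetical. Since the $y = 0$ term on the right-hand side of (1.2) contains $\psi^{(d)}(u)$ itself, the code solves for it, which gives $h_0\,\psi^{(d)}(u) = \sum_{y=1}^{u-1}(1 - H(y))\psi^{(d)}(u - y) + \sum_{y \ge u}(1 - H(y))$.

```python
def ruin_probabilities(h, u_max):
    """Return [psi(0), ..., psi(u_max)] for the compound binomial model (1.1).

    h[k] = P(X_1 = k) for k = 0, ..., K (finite support, an assumption of this
    sketch). Requires h[0] > 0 and E X_1 < 1.
    """
    # tail[y] = 1 - H(y); zero beyond the support
    tail, cum = [], 0.0
    for hk in h:
        cum += hk
        tail.append(max(0.0, 1.0 - cum))
    mean = sum(tail)                      # E X_1 = sum_{y >= 0} (1 - H(y))
    if h[0] <= 0.0 or mean >= 1.0:
        raise ValueError("need h_0 > 0 and the net profit condition E X_1 < 1")

    def hbar(y):
        return tail[y] if y < len(tail) else 0.0

    psi = [mean]                          # starting value psi(0) = E X_1
    partial = 0.0                         # sum_{y < u} (1 - H(y))
    for u in range(1, u_max + 1):
        partial += hbar(u - 1)
        conv = sum(hbar(y) * psi[u - y] for y in range(1, u))
        psi.append((conv + mean - partial) / h[0])
    return psi


if __name__ == "__main__":
    theta = 0.6
    print(ruin_probabilities([theta, 0.0, 1.0 - theta], 5))
```

For the two-point distribution of Example 1.5 the output can be compared with $((1 - \theta)/\theta)^u$ for $u \ge 1$, while the first entry is the starting value $E(X_1) = 2(1 - \theta)$.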


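Example 1.6 Assume geometrically distributed claim sizes with $h_0 = p$ and
$h_k = (1 - p)(1 - a)a^{k-1}$ for $k \ge 1$ with $0 < p < 1$ and $a$ such that $E(X) =
(1 - p)/(1 - a) < 1$. In this case the recursion (1.2) reduces, after a little algebra,
to $\psi^{(d)}(u + 1) = (a/p)\,\psi^{(d)}(u)$, which together with $\psi^{(d)}(0) = E(X)$ leads to

$$\psi^{(d)}(u) = \frac{1 - p}{1 - a}\Big(\frac{a}{p}\Big)^{u}, \qquad u = 0, 1, 2, \ldots$$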


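Consider now the finite-time ruin probability

$$\psi^{(d)}(u, t) = P(\tau^{(d)}(u) \le t), \qquad t \in \mathbb{N}.$$

Noting that $\psi^{(d)}(u, 1) = 1 - H(u)$, it follows from the Markov property that

$$\psi^{(d)}(u, t) = \psi^{(d)}(u, 1) + \sum_{k=0}^{u} h_k\,\psi^{(d)}(u + 1 - k,\ t - 1) \qquad \text{for all } t = 2, 3, \ldots \tag{1.8}$$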
This bivariate recursion² can now be used to recursively calculate $\psi^{(d)}(u, t)$ for
fixed integer values of $u$ and $t$, whenever the claim size probability function
$h_k$ ($k = 0, 1, 2, \ldots$) is given. Although this simple recursion is one of the reasons
why the discrete-time model also has some popularity for approximating $\psi(u, t)$
of continuous-time models (in particular when adding additional features like
interest rates, investment and dividends in the model), one should note that
in practical applications it can be very computer-intensive to implement and
the result only gives the numerical values with no hold of sensitivities to model
assumptions such as claim size parameters.

There is also a natural analogue of the adjustment coefficient for this discrete
set-up. Define $\gamma_d$ as the unique positive root of the equation

$$E\,e^{\gamma_d (X_1 - 1)} = 1$$

if it exists. The reason for this definition is that in this way $\{e^{-\gamma_d R_n^{(d)}}\}$ is a
martingale and because $R_n^{(d)} \to \infty$ on $\{\tau^{(d)}(u) = \infty\}$, Proposition II.3.1 applies
and gives

Proposition 1.7 Assume that the adjustment coefficient $\gamma_d > 0$ exists. Then

$$\psi^{(d)}(u) = \frac{e^{-\gamma_d u}}{E\big[\exp\{-\gamma_d R^{(d)}_{\tau^{(d)}(u)}\} \,\big|\, \tau^{(d)}(u) < \infty\big]}.$$

In particular, the Lundberg inequality $\psi^{(d)}(u) \le e^{-\gamma_d u}$ holds.


Notes and references An early reference for the compound binomial model (1.1) 
is Gerber [398]. Pollaczeck-Khinchine-type formulas for $\psi^{(d)}(u)$ can be found in Gerber
[404] and Shiu [802]. De Vylder & Goovaerts [302] investigate possibilities to speed
up the recursive calculation (1.8). In particular they give error bounds for $\psi^{(d)}(u, t)$
if the claim size distribution is truncated. Another representation of the recursion 
(1.8) is given in Willmot [885]. Other quantities like the time of ruin, the surplus 
prior to ruin and the deficit at ruin in the compound binomial model are e.g. studied 
in Cheng, Gerber & Shiu [235], Li & Garrido [588] and Liu & Guo [602]. In the 
compound binomial model the number of periods until a claim occurs is geometrically 
distributed with parameter ho. An extension is to allow for more general distributions 
for the number of interclaim time periods (which is the discrete-time analogue of the 
extension of a Poisson process to a renewal process). The resulting discrete-time Sparre 
Andersen model has received some interest recently; for a survey see Li, Lu & Garrido 
[593]. 


² Which is the discrete-time analogue of having an additional partial derivative w.r.t. time
$t$ in the integro-differential equation for $\psi$ in the compound Poisson case, cf. Chapter V.
Cossette, Landriault & Marceau [260] extend the compound binomial model by
assuming that the indicator random variable of whether X; is non-zero in period j 
follows a homogeneous Markov chain, for an extension to a Markov-modulated envi- 
ronment for both this indicator r.v. and the claim size distribution see [261] and Yang, 
Zhang & Lan [905]. Another dependence structure between subsequent claim sizes is 
considered in Yuen & Guo [907]. For the effect of dependent claims in a discrete-time 
model with continuous claim size distribution see e.g. Cossette & Marceau [259], Wu & 
Yuen [898] and Reinhard & Snoussi [732]. De Kok [286] deals with an inhomogeneous 
risk model. 

Dickson & Waters [318] use the recursions of the discrete-time model to effectively 
approximate finite-time ruin probabilities in the Cramér-Lundberg model, under ad- 
ditional force of interest see [320] and Brekelmans & De Waegenaere [200]. Egidio 
dos Reis [339] deals with moments of ruin and recovery times. For stochastic ordering 
concepts in the discrete framework and resulting ordering of ruin probabilities we refer 
to Denuit & Lefèvre [295]. 

As mentioned in the beginning of this section, one may take the viewpoint that 
observations and actions can in practice only happen at discrete points in time, but 
the underlying risk model has many computational and qualitative advantages when 
being continuous time. A possible bridge between these conflicting arguments can be to 
assume an underlying continuous time model and indeed only observe the risk process 
(and potential ruin) at discrete times. If these discrete time points are assumed to be 
random, e.g. exponentially distributed, this still leads to explicit formulas of continuous 
time flavor. By moving towards Erlang(n) (and hence more peaked) observation times 
with growing n, one approaches the discrete-time set-up with computational vehicle 
of continuous time models. This procedure is worked out in [20] and is close in spirit 
to the Erlangization approach for finite time horizon ruin probabilities as discussed in 
Section IX.8. For statistical inference issues for a continuous time risk model under 
discrete observations, see Shimizu [797]. 


2 The distribution of the aggregate claims 


We study the distribution of the aggregate claims $A_t = \sum_{i=1}^{N_t} U_i$ at time $t$,
assuming that the $U_i$ are i.i.d. with common distribution $B$ and independent of
$N_t$. In particular, we are interested in estimating $P(A_t > x)$ for large $x$. This
is a topic of practical importance in the insurance business for assessing the
probability of a great loss in a period of length $t$, say one year. Further, the
study is motivated from the formulas in V.2 expressing the finite horizon ruin
probabilities in terms of the distribution of $A_t$.

The main example is $N_t$ being Poisson with rate $\beta t$. For notational simplicity,
we then take $t = 1$ so that

$$p_n = P(N = n) = e^{-\beta}\frac{\beta^n}{n!}. \tag{2.1}$$


However, much of the analysis carries over to more general cases, though we do 
not always spell this out. 


2a The saddlepoint approximation 


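We impose the Poisson assumption (2.1) and define $A = A_1$. Then $Ee^{\alpha A} = e^{\kappa(\alpha)}$
where $\kappa(\alpha) = \beta(\hat B[\alpha] - 1)$. The exponential family generated by $A$ is given by

$$P_\theta(A \in dx) = E\big[e^{\theta A - \kappa(\theta)};\ A \in dx\big].$$

In particular,

$$\kappa_\theta(\alpha) = \log E_\theta e^{\alpha A} = \kappa(\alpha + \theta) - \kappa(\theta) = \beta_\theta\big(\hat B_\theta[\alpha] - 1\big)$$

where $\beta_\theta = \beta \hat B[\theta]$ and $B_\theta$ is the distribution given by

$$B_\theta(dx) = \frac{e^{\theta x}}{\hat B[\theta]}\,B(dx).$$

This shows that the $P_\theta$-distribution of $A$ has a similar compound Poisson form
as the $P$-distribution, only with $\beta$ replaced by $\beta_\theta$ and $B$ by $B_\theta$.

The analysis largely follows Example XIII.1.1. For a given $x$, we define the
saddlepoint $\theta = \theta(x)$ by $E_\theta A = x$, i.e. $\kappa_\theta'(0) = \kappa'(\theta) = x$.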


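Proposition 2.1 Assume that $\lim_{r \uparrow r^*} \hat B''[r] = \infty$,

$$\lim_{r \uparrow r^*} \frac{\hat B'''[r]}{(\hat B''[r])^{3/2}} = 0, \tag{2.2}$$

where $r^* = \sup\{r : \hat B[r] < \infty\}$. Then as $x \to \infty$,

$$P(A > x) \sim \frac{e^{-\theta x + \kappa(\theta)}}{\theta\sqrt{2\pi\beta \hat B''[\theta]}}. \tag{2.3}$$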
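Proof. Since $E_\theta A = x$, $\mathrm{Var}_\theta(A) = \kappa''(\theta) = \beta \hat B''[\theta]$, (2.2) implies that the
limiting $P_\theta$-distribution of $(A - x)/\sqrt{\beta \hat B''[\theta]}$ is standard normal. Hence

$$P(A > x) = E_\theta\big[e^{-\theta A + \kappa(\theta)};\ A > x\big] = e^{-\theta x + \kappa(\theta)} E_\theta\big[e^{-\theta(A - x)};\ A > x\big]$$
$$\sim e^{-\theta x + \kappa(\theta)} \int_0^\infty e^{-\theta\sqrt{\beta \hat B''[\theta]}\,y}\,\frac{1}{\sqrt{2\pi}}\,e^{-y^2/2}\,dy$$
$$= \frac{e^{-\theta x + \kappa(\theta)}}{\theta\sqrt{2\pi\beta \hat B''[\theta]}} \int_0^\infty e^{-z}\,e^{-z^2/(2\theta^2\beta \hat B''[\theta])}\,dz$$
$$\sim \frac{e^{-\theta x + \kappa(\theta)}}{\theta\sqrt{2\pi\beta \hat B''[\theta]}} \int_0^\infty e^{-z}\,dz = \frac{e^{-\theta x + \kappa(\theta)}}{\theta\sqrt{2\pi\beta \hat B''[\theta]}}.$$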


It should be noted that the heavy-tailed asymptotics is much more straight- 
forward. In fact, just the same dominated convergence argument as in the proof 
of Theorem X.2.1 yields: 


Proposition 2.2 If $B$ is subexponential and $Ez^N < \infty$ for some $z > 1$, then
$P(A > x) \sim EN\,\bar B(x)$.


Notes and references Proposition 2.1 goes all the way back to Esscher [358], and 
(2.3) is often referred to as the Esscher approximation. 

The present proof is somewhat heuristical in the CLT steps. For a rigorous proof, 
some regularity of the density b(x) of B is required. In particular, either of the following 
is sufficient: 


A. $b$ is gamma-like, i.e. bounded with $b(x) \sim c\,x^{\alpha - 1}e^{-\delta x}$.


B. $b$ is log-concave, or, more generally, $b(x) = q(x)e^{-h(x)}$, where $q(x)$ is bounded
away from 0 and $\infty$ and $h(x)$ is convex on an interval of the form $[x_0, x^*)$ where
$x^* = \sup\{x : b(x) > 0\}$. Furthermore $\int_0^\infty b(x)^\zeta\,dx < \infty$ for some $\zeta \in (1, 2)$.

For example, A covers the exponential distribution and phase-type distributions, B
covers distributions with finite support or with a density not too far from $e^{-x^\alpha}$ with
$\alpha > 1$. For details, see Embrechts et al. [347], Jensen [506] and references therein. For
higher-order extensions of the asymptotic behavior in Proposition 2.2 see Albrecher,
Hipp & Kortschak [29] and references therein. It is also shown there that the folklore
use of the shifted asymptotics $P(A > x) \sim \beta \bar B(x - \beta\mu_B)$ for Poisson $N$ can be rigorously
justified in the sense that, under mild additional assumptions on $B$, the shifting
$P(A > x) \sim EN\,\bar B\big(x - \mu_B(E(N^2)/E(N) - 1)\big)$ improves the asymptotic accuracy of Proposition
2.2 by an order of magnitude.

Asymptotic results for situations where the tail behavior of N determines the tail 
behavior of $A$ are given in Asmussen, Klüppelberg & Sigman [87] and Robert & Segers
[742]. 
2b The NP approximation

In many cases, the distribution of $A$ is approximately normal. For example,
under the Poisson assumption (2.1), it holds that $EA = \beta\mu_B$, $\mathrm{Var}(A) = \beta\mu_B^{(2)}$
and that $(A - \beta\mu_B)/(\beta\mu_B^{(2)})^{1/2}$ has a limiting standard normal distribution as
$\beta \to \infty$, leading to

$$P(A > x) \approx 1 - \Phi\Big(\frac{x - \beta\mu_B}{\sqrt{\beta\mu_B^{(2)}}}\Big). \tag{2.4}$$


The result to be surveyed below improves upon this and related approximations 
by taking into account second order terms from the Edgeworth expansion. 


Remark 2.3 A word of warning should be said right away: the CLT (and the 
Edgeworth expansion) can only be expected to provide a good fit in the center 
of the distribution. Thus, it is quite questionable to use (2.4) and related results 
for the case of main interest, large x. 


The (first order) Edgeworth expansion states that if the characteristic
function $\hat g(u) = Ee^{iuY}$ of a r.v. $Y$ satisfies

$$\hat g(u) \approx e^{-u^2/2}(1 + i\delta u^3), \tag{2.5}$$

where $\delta$ is a small parameter, then

$$P(Y \le y) \approx \Phi(y) - \delta(1 - y^2)\varphi(y). \tag{2.6}$$


Note as a further warning that the r.h.s. of (2.6) may be negative and is not 
necessarily an increasing function of y for |y| large. 

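Heuristically, (2.6) is obtained by noting that by Fourier inversion, the
density of $Y$ is

$$g(y) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-iuy}\,\hat g(u)\,du
\approx \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-iuy}\,e^{-u^2/2}(1 + i\delta u^3)\,du
= \varphi(y) - \delta(y^3 - 3y)\varphi(y),$$

and from this (2.6) follows by integration.

In concrete examples, the CLT for $Y = Y_\delta$ is usually derived via expanding
the ch.f. as

$$\hat g(u) = Ee^{iuY} = \exp\Big\{iu\kappa_1 - \frac{u^2}{2}\kappa_2 - i\frac{u^3}{3!}\kappa_3 + \frac{u^4}{4!}\kappa_4 + \cdots\Big\}$$

where $\kappa_1, \kappa_2, \ldots$ are the cumulants; in particular,

$$\kappa_1 = EY, \qquad \kappa_2 = \mathrm{Var}(Y), \qquad \kappa_3 = E(Y - EY)^3.$$

Thus if $EY = 0$, $\mathrm{Var}(Y) = 1$ as above, one needs to show that $\kappa_3, \kappa_4, \ldots$ are
small. If this holds, one expects the $u^3$ term to dominate the terms of order
$u^4, u^5, \ldots$ so that

$$\hat g(u) \approx \exp\Big\{-\frac{u^2}{2} - i\frac{u^3}{6}\kappa_3\Big\} \approx \exp\Big\{-\frac{u^2}{2}\Big\}\Big(1 - i\frac{u^3}{6}\kappa_3\Big)$$

so that we should take $\delta = -\kappa_3/6$ in (2.6).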

Rather than with the tail probabilities $P(A > x)$, the NP (normal power)
approximation deals with the quantile $a_{1-\epsilon}$, defined as the solution of
$P(A \le a_{1-\epsilon}) = 1 - \epsilon$. A particular case is $a_{.995}$, which is often used as the VaR (Value
at Risk) for risk management purposes.

Let $Y = (A - EA)/\sqrt{\mathrm{Var}(A)}$ and let $y_{1-\epsilon}$, $z_{1-\epsilon}$ be the $(1 - \epsilon)$-quantile in the
distribution of $Y$, resp. the standard normal distribution. If the distribution of
$Y$ is close to $N(0,1)$, $y_{1-\epsilon}$ should be close to $z_{1-\epsilon}$ (cf., however, Remark 2.3!),
and so as a first approximation we obtain

$$a_{1-\epsilon} = EA + y_{1-\epsilon}\sqrt{\mathrm{Var}(A)} \approx EA + z_{1-\epsilon}\sqrt{\mathrm{Var}(A)}. \tag{2.7}$$


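A correction term may be computed from (2.6) by noting that the $\Phi(y)$ terms
dominate the $\delta(1 - y^2)\varphi(y)$ term. This leads to

$$1 - \epsilon \approx \Phi(y_{1-\epsilon}) - \delta(1 - y_{1-\epsilon}^2)\varphi(y_{1-\epsilon})$$
$$\approx \Phi(y_{1-\epsilon}) - \delta(1 - z_{1-\epsilon}^2)\varphi(z_{1-\epsilon})$$
$$\approx \Phi(z_{1-\epsilon}) + (y_{1-\epsilon} - z_{1-\epsilon})\varphi(z_{1-\epsilon}) - \delta(1 - z_{1-\epsilon}^2)\varphi(z_{1-\epsilon})$$
$$= 1 - \epsilon + (y_{1-\epsilon} - z_{1-\epsilon})\varphi(z_{1-\epsilon}) - \delta(1 - z_{1-\epsilon}^2)\varphi(z_{1-\epsilon})$$

which combined with $\delta = -EY^3/6$ leads to

$$y_{1-\epsilon} = z_{1-\epsilon} + \frac{1}{6}(z_{1-\epsilon}^2 - 1)\,EY^3.$$

Using $Y = (A - EA)/\sqrt{\mathrm{Var}(A)}$, this yields the NP approximation

$$a_{1-\epsilon} = EA + z_{1-\epsilon}(\mathrm{Var}(A))^{1/2} + \frac{1}{6}(z_{1-\epsilon}^2 - 1)\frac{E(A - EA)^3}{\mathrm{Var}(A)}. \tag{2.8}$$

Under the Poisson assumption (2.1), the $k$th cumulant of $A$ is $\beta\mu_B^{(k)}$ and
so $\kappa_k = \beta\mu_B^{(k)}/(\beta\mu_B^{(2)})^{k/2}$. In particular, $\kappa_3$ is small for large $\beta$ but dominates
$\kappa_4, \kappa_5, \ldots$ as required. We can rewrite (2.8) as

$$a_{1-\epsilon} = \beta\mu_B + z_{1-\epsilon}(\beta\mu_B^{(2)})^{1/2} + \frac{1}{6}(z_{1-\epsilon}^2 - 1)\frac{\mu_B^{(3)}}{\mu_B^{(2)}}. \tag{2.9}$$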
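
For a compound Poisson sum the NP approximation reads $a_{1-\epsilon} \approx \beta\mu_B + z_{1-\epsilon}(\beta\mu_B^{(2)})^{1/2} + (z_{1-\epsilon}^2 - 1)\mu_B^{(3)}/(6\mu_B^{(2)})$, which is easy to evaluate numerically. The following sketch assumes that the first three raw moments of the claim size distribution are supplied by the user; the function name is hypothetical. The caveat of Remark 2.3 about using normal-type approximations far out in the tail applies to its output as well.

```python
import math
from statistics import NormalDist


def np_quantile_compound_poisson(beta, mu1, mu2, mu3, eps):
    """NP approximation of the (1 - eps)-quantile of a compound Poisson sum A.

    beta: Poisson rate; mu1, mu2, mu3: first three raw moments of the claim
    size distribution B (supplied by the caller, an assumption of this sketch).
    """
    z = NormalDist().inv_cdf(1.0 - eps)                    # standard normal quantile
    normal_part = beta * mu1 + z * math.sqrt(beta * mu2)   # plain normal approximation
    correction = (z * z - 1.0) * mu3 / (6.0 * mu2)         # skewness correction
    return normal_part + correction


if __name__ == "__main__":
    # Exponential claims with mean 1: raw moments 1, 2, 6
    print(np_quantile_compound_poisson(100.0, 1.0, 2.0, 6.0, eps=0.005))
```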


Notes and references We have followed largely Sundt [820]. Another main ref- 
erence is Daykin et al. [279]. Note, however, that [279] distinguishes between the NP 
and Edgeworth approximations. 


2c Panjer’s recursion 


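Consider $A = \sum_{i=1}^{N} U_i$, let $p_n = P(N = n)$, and assume that there exist
constants $a$, $b$ such that

$$p_n = \Big(a + \frac{b}{n}\Big)p_{n-1}, \qquad n = 1, 2, \ldots \tag{2.10}$$

For example, this holds with $a = 0$, $b = \beta$ for the Poisson distribution with rate
$\beta$ since

$$p_n = e^{-\beta}\frac{\beta^n}{n!} = \frac{\beta}{n}\,e^{-\beta}\frac{\beta^{n-1}}{(n-1)!} = \frac{\beta}{n}\,p_{n-1}.$$

Proposition 2.4 Assume that $B$ is concentrated on $\{0, 1, 2, \ldots\}$ and write
$g_j = P(U_i = j)$, $j = 0, 1, 2, \ldots$, $f_j = P(A = j)$, $j = 0, 1, \ldots$. Then $f_0 = \sum_{n=0}^{\infty} g_0^n p_n$
and

$$f_j = \frac{1}{1 - a g_0}\sum_{k=1}^{j}\Big(a + b\frac{k}{j}\Big) g_k f_{j-k}, \qquad j = 1, 2, \ldots \tag{2.11}$$

In particular, if $g_0 = 0$, then

$$f_0 = p_0, \qquad f_j = \sum_{k=1}^{j}\Big(a + b\frac{k}{j}\Big) g_k f_{j-k}, \qquad j = 1, 2, \ldots \tag{2.12}$$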


Remark 2.5 The crux of Proposition 2.4 is that the algorithm is much faster
than the naive method, which would consist in noting that (in the case $g_0 = 0$)

$$f_j = \sum_{n=1}^{j} p_n\,g_j^{*n} \tag{2.13}$$

where $g^{*n}$ is the $n$th convolution power of $g$, and calculating the $g_j^{*n}$ recursively
by $g_j^{*1} = g_j$,

$$g_j^{*n} = \sum_{k=n-1}^{j-1} g_k^{*(n-1)}\,g_{j-k}. \tag{2.14}$$

Namely, the complexity (number of arithmetic operations required) is $O(j^3)$ for
(2.13), (2.14) but only $O(j^2)$ for Proposition 2.4.
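
For the Poisson case ($a = 0$, $b = \beta$) the recursion of Proposition 2.4 becomes $f_0 = e^{\beta(g_0 - 1)}$ and $f_j = \sum_{k=1}^{j} (\beta k/j)\, g_k f_{j-k}$. A minimal sketch follows, assuming a claim size distribution with finite support given as a list `g = [g_0, ..., g_K]`; the function name is hypothetical.

```python
import math


def panjer_poisson(beta, g, j_max):
    """Return [f_0, ..., f_jmax], f_j = P(A = j), for A = U_1 + ... + U_N.

    N ~ Poisson(beta); g[k] = P(U = k) for k = 0, ..., len(g) - 1
    (finite support is an assumption of this sketch).
    """
    f = [math.exp(beta * (g[0] - 1.0))]          # f_0 = sum_n g_0^n p_n
    for j in range(1, j_max + 1):
        s = 0.0
        for k in range(1, min(j, len(g) - 1) + 1):
            s += beta * k / j * g[k] * f[j - k]  # (a + b k / j) g_k f_{j-k}, a = 0
        f.append(s)
    return f


if __name__ == "__main__":
    f = panjer_poisson(1.5, [0.2, 0.5, 0.3], 40)
    print(sum(f))  # total mass captured up to j = 40
```

The double loop makes the $O(j^2)$ operation count visible: computing $f_j$ costs at most $j$ multiplications once $f_0, \ldots, f_{j-1}$ are available.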


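Proof of Proposition 2.4. The expression for $f_0$ is obvious. By symmetry,

$$E\Big[a + b\frac{U_i}{j}\ \Big|\ \sum_{i=1}^{n} U_i = j\Big] \tag{2.15}$$

is independent of $i = 1, \ldots, n$. Since the sum over $i$ is $na + b$, the value of (2.15)
is therefore $a + b/n$. Hence by (2.10), (2.13) we get for $j > 0$ that

$$f_j = \sum_{n=1}^{\infty}\Big(a + \frac{b}{n}\Big)p_{n-1}\,g_j^{*n}$$
$$= \sum_{n=1}^{\infty} E\Big[a + b\frac{U_1}{j}\ \Big|\ \sum_{i=1}^{n} U_i = j\Big]\,p_{n-1}\,g_j^{*n}$$
$$= \sum_{n=1}^{\infty} E\Big[a + b\frac{U_1}{j};\ \sum_{i=1}^{n} U_i = j\Big]\,p_{n-1}$$
$$= \sum_{n=1}^{\infty}\sum_{k=0}^{j}\Big(a + b\frac{k}{j}\Big) g_k\,g_{j-k}^{*(n-1)}\,p_{n-1}$$
$$= \sum_{k=0}^{j}\Big(a + b\frac{k}{j}\Big) g_k \sum_{n=0}^{\infty} g_{j-k}^{*n}\,p_n = \sum_{k=0}^{j}\Big(a + b\frac{k}{j}\Big) g_k f_{j-k}$$
$$= a g_0 f_j + \sum_{k=1}^{j}\Big(a + b\frac{k}{j}\Big) g_k f_{j-k},$$

and (2.11) follows.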


If the distribution $B$ of the $U_i$ is non-lattice, it is natural to use a
discrete approximation. To this end, let $U_{i,+}^{(h)}$, $U_{i,-}^{(h)}$ be $U_i$ rounded upwards, resp.
downwards, to the nearest multiple of $h$ and let $A_{\pm}^{(h)} = \sum_{i=1}^{N} U_{i,\pm}^{(h)}$. An obvious
modification of Proposition 2.4 applies to evaluate the distribution $F_{\pm}^{(h)}$ of $A_{\pm}^{(h)}$,
letting $f_{j,\pm}^{(h)} = P(A_{\pm}^{(h)} = jh)$ and

$$g_{k,-}^{(h)} = P(U_{i,-}^{(h)} = kh) = B((k + 1)h) - B(kh), \qquad k = 0, 1, 2, \ldots,$$
$$g_{k,+}^{(h)} = P(U_{i,+}^{(h)} = kh) = B(kh) - B((k - 1)h) = g_{k-1,-}^{(h)}, \qquad k = 1, 2, \ldots$$


Then the error on the tail probabilities (which can be taken arbitrarily small by 
choosing h small enough) can be evaluated by 


SP Pees J p 
j=(|2/h] j=|2/h] 


Further examples (and in fact the only ones, cf. Sundt & Jewell [821]) where 
(2.10) holds are the binomial distribution and the negative binomial (in par- 
ticular, geometric) distribution. The geometric case is of particular importance 
because of the following result which immediately follows from combining Propo- 
sition 2.4 and the Pollaczeck-Khinchine representation: 


Corollary 2.6 Consider a compound Poisson risk process with Poisson rate $\beta$
and claim size distribution $B$. Then for any $h > 0$, the ruin probability $\psi(u)$


satisfies 
lo) 


yo Pee O (2.16) 
j=Lu/h] j=Lu/k] 


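where $f_{j,+}^{(h)}$, $f_{j,-}^{(h)}$ are given by the recursions

$$f_{j,+}^{(h)} = \rho\sum_{k=1}^{j} g_{k,+}^{(h)}\,f_{j-k,+}^{(h)}, \qquad j = 1, 2, \ldots$$
$$f_{j,-}^{(h)} = \frac{\rho}{1 - \rho\,g_{0,-}^{(h)}}\sum_{k=1}^{j} g_{k,-}^{(h)}\,f_{j-k,-}^{(h)}, \qquad j = 1, 2, \ldots$$

starting from $f_{0,+}^{(h)} = 1 - \rho$, $f_{0,-}^{(h)} = (1 - \rho)/(1 - \rho\,g_{0,-}^{(h)})$ and using

$$g_{k,-}^{(h)} = B_0((k + 1)h) - B_0(kh) = \frac{1}{\mu_B}\int_{kh}^{(k+1)h} \bar B(x)\,dx, \qquad k = 0, 1, 2, \ldots,$$
$$g_{k,+}^{(h)} = B_0(kh) - B_0((k - 1)h) = g_{k-1,-}^{(h)}, \qquad k = 1, 2, \ldots$$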
Remark 2.7 It is clear that the quotient of the upper and the lower bound
in (2.16) tends to 1 for $u \to \infty$ if $U_i$ is long-tailed (i.e. $\bar B(x - y)/\bar B(x) \to 1$
as $x \to \infty$ for all $y$, cf. Chapter X). Correspondingly, in numerical
implementations one typically observes that the difference between the upper and lower
bound in (2.16) gets larger for increasing $u$, but for long-tailed (in particular
subexponential) $U_i$ again tends to zero for still larger $u$.


Notes and references The literature on recursive algorithms related to Panjer’s 
recursion is extensive, see e.g. Dickson [307] and references therein. Recursion formu- 
las for counting distributions that are much more general than the class defined by 
(2.10) have been studied in the literature. A natural and very general class seems 
to be counting distributions that satisfy a finite-order homogeneous recursion with 
polynomial coefficients, see e.g. Wang & Sobrero [873]. For a survey that also covers 
multivariate extensions, see Sundt & Vernic [824]. Gerhold, Schmock & Warnung [416] 
provide an improved recursion algorithm; see also Hipp [467] for a speed-up from order 
$O(j^2)$ to order $O(j)$ for phase-type claim size distributions. In recent years, due to the
increasing available computer power the emphasis is gradually shifting towards direct 
numerical inversion of the moment-generating function of the aggregate claim size. In 
the context of discrete claim size distributions, Fast Fourier Transform techniques can 
be quite powerful (see Grübel & Hermesmeier [439, 440] for details and Embrechts &
Frei [344] for a recent comparison). 


2d The distribution of dependent sums 


Whereas for the results in the previous subsections the independence assumption 
for the summands was crucial, in practice one will often face situations where 
information is needed about the tails of sums of dependent random variables. 
Clearly there are infinitely many possible dependence structures for a fixed set of 
marginal distributions and one cannot expect a complete picture of how depen- 
dence affects the behavior of the distribution tail. Nevertheless certain patterns 
occur and for the tail behavior it seems natural that only the dependence of the 
summands in the tail is important. In the sequel we will state some results in 
this direction (mainly) for the sum of two identically distributed subexponential 
r.v.’s to illustrate the challenges that occur when dependence enters. For more 
general results see the references in the Notes at the end of the section. 

Let us consider the sum $X_1 + X_2$ of two identically distributed subexponential
random variables each with distribution function $B$. By definition, if $X_1$ and
$X_2$ are independent, then $P(X_1 + X_2 > x) \sim 2\bar B(x)$ as $x \to \infty$ and a natural
question is under which assumptions on the dependence structure of $X_1$ and $X_2$
and on $B$ the same asymptotic relation holds true with dependence.

A first rough description of tail dependence between $X_1$ and $X_2$ is the
so-called (upper) tail dependence coefficient

$$\lambda = \lim_{v \to \infty} P(X_2 > v \mid X_1 > v).$$

If $\lambda = 0$, then $X_1$ and $X_2$ are called tail-independent. The following simple
result extends Proposition X.1.1(a).

Lemma 2.8 $P(\max(X_1, X_2) > x) \sim (2 - \lambda)\bar B(x)$.

Proof.

$$P(\max(X_1, X_2) > x) = P(X_1 > x) + P(X_2 > x) - P(X_1 > x, X_2 > x)$$
$$= \bar B(x) + \bar B(x) - \bar B(x)\,P(X_2 > x \mid X_1 > x).$$


Let us first collect some results for regularly varying marginals, a case that 
is quite well understood. 
Regularly varying marginal distributions 
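Proposition 2.9 Let $\bar B(x) \sim L(x)x^{-\alpha}$ with $\alpha > 0$. Then

$$\limsup_{x \to \infty} \frac{P(X_1 + X_2 > x)}{\bar B(x)} \le
\begin{cases}
\big(\lambda^{1/(\alpha+1)} + (2 - 2\lambda)^{1/(\alpha+1)}\big)^{\alpha+1}, & 0 \le \lambda \le 2/3,\\
2^{\alpha}(2 - \lambda), & 2/3 < \lambda \le 1.
\end{cases}$$

Proof. Analogously to the proof of Proposition X.1.4, for any $0 < \delta \le 1/2$ we
have

$$P(X_1 + X_2 > x) \le P\big(\{X_1 > (1 - \delta)x\} \cup \{X_2 > (1 - \delta)x\} \cup (\{X_1 > \delta x\} \cap \{X_2 > \delta x\})\big)$$
$$\le 2\bar B((1 - \delta)x) + P(X_1 > \delta x, X_2 > \delta x) - 2P(X_1 > (1 - \delta)x, X_2 > (1 - \delta)x)$$

so that

$$\limsup_{x \to \infty} \frac{P(X_1 + X_2 > x)}{\bar B(x)} \le \limsup_{x \to \infty}\Big((2 - 2\lambda)\frac{\bar B((1 - \delta)x)}{\bar B(x)} + \lambda\frac{\bar B(\delta x)}{\bar B(x)}\Big)
= \frac{2 - 2\lambda}{(1 - \delta)^{\alpha}} + \frac{\lambda}{\delta^{\alpha}}.$$

Within the defined range of $\delta$, this upper bound is minimized for

$$\delta = \begin{cases}
\dfrac{1}{1 + (2/\lambda - 2)^{1/(\alpha+1)}}, & 0 < \lambda \le 2/3,\\[2mm]
1/2, & 2/3 < \lambda \le 1,
\end{cases}$$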
which yields the result.
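
As a numerical plausibility check of the minimization step (a sketch, with hypothetical function names), one can compare the closed-form bound $\big(\lambda^{1/(\alpha+1)} + (2 - 2\lambda)^{1/(\alpha+1)}\big)^{\alpha+1}$ for $\lambda \le 2/3$, resp. $2^{\alpha}(2 - \lambda)$ for $\lambda > 2/3$, with a grid minimum of $(2 - 2\lambda)(1 - \delta)^{-\alpha} + \lambda\delta^{-\alpha}$ over $\delta \in (0, 1/2]$.

```python
def sum_tail_bound(alpha, lam):
    """Closed-form upper bound for limsup P(X1 + X2 > x) / B-bar(x)."""
    if lam <= 2.0 / 3.0:
        p = 1.0 / (alpha + 1.0)
        return (lam ** p + (2.0 - 2.0 * lam) ** p) ** (alpha + 1.0)
    return 2.0 ** alpha * (2.0 - lam)


def bound_at(delta, alpha, lam):
    """The bound (2 - 2 lam) / (1 - delta)^alpha + lam / delta^alpha for fixed delta."""
    return (2.0 - 2.0 * lam) / (1.0 - delta) ** alpha + lam / delta ** alpha


if __name__ == "__main__":
    grid = [i / 10000.0 for i in range(1, 5001)]     # delta in (0, 1/2]
    for lam in (0.3, 0.9):
        numeric = min(bound_at(d, 1.5, lam) for d in grid)
        print(lam, numeric, sum_tail_bound(1.5, lam))
```

The special cases $\lambda = 0$ and $\lambda = 1$ return 2 and $2^{\alpha}$, respectively.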


This upper bound is sharp for both independence and comonotone dependence
(in the latter case,³ $P(X_1 + X_2 > x) \sim 2^{\alpha}\bar B(x)$).
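The bound of Proposition 2.9 is easy to tabulate. The following is a minimal Python sketch (the function names `prop_2_9_bound`, `delta_bound` and `grid_minimum` are hypothetical, introduced only for illustration) that evaluates the closed form and compares it with a brute-force minimization of $(2-2\lambda)/(1-\delta)^{\alpha} + \lambda/\delta^{\alpha}$ over a grid of $\delta \in (0, 1/2]$.

```python
def prop_2_9_bound(lam: float, alpha: float) -> float:
    """Closed-form upper bound of Proposition 2.9 for given lambda and alpha."""
    if not 0.0 <= lam <= 1.0 or alpha <= 0.0:
        raise ValueError("need 0 <= lam <= 1 and alpha > 0")
    if lam <= 2.0 / 3.0:
        p = 1.0 / (alpha + 1.0)
        return (lam ** p + (2.0 - 2.0 * lam) ** p) ** (alpha + 1.0)
    return 2.0 ** alpha * (2.0 - lam)


def delta_bound(lam: float, alpha: float, delta: float) -> float:
    """The bound from the proof for one fixed delta in (0, 1/2]."""
    return (2.0 - 2.0 * lam) / (1.0 - delta) ** alpha + lam / delta ** alpha


def grid_minimum(lam: float, alpha: float, steps: int = 20000) -> float:
    """Brute-force minimum of delta_bound over a grid of (0, 1/2]."""
    return min(delta_bound(lam, alpha, 0.5 * k / steps) for k in range(1, steps + 1))


if __name__ == "__main__":
    for lam in (0.0, 0.3, 0.9, 1.0):
        print(lam, prop_2_9_bound(lam, 2.0))
```

For $\lambda = 0$ the closed form reduces to 2 and for $\lambda = 1$ to $2^{\alpha}$, the two sharp cases mentioned in the text.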
A combination of Lemma 2.8 and Proposition 2.9 immediately shows 


Corollary 2.10 If $B$ has a regularly varying tail and $\lambda = 0$, then
$P(X_1 + X_2 > x) \sim 2\bar B(x)$ as $x \to \infty$.


Hence, for regularly varying distributions, tail independence is already a
sufficient criterion to guarantee that the tail of the dependent sum behaves
asymptotically as if $X_1$ and $X_2$ were independent.

An important and natural subclass of distributions with regularly varying
marginals are the ones with multivariate regular variation (for consistency we
only state the bivariate case, although the extension to n dimensions is obvious).
A vector $X = (X_1, X_2)$ is regularly varying with index $-\alpha < 0$ if there exist a
probability measure $S$ on $\mathbb{S}^1_+$ (the unit sphere in $\mathbb{R}^2$ with respect to the Euclidean
norm $|\cdot|$, restricted to the first quadrant) and a function $b(x) \to \infty$ such that

$$b^{-1}(x)\, P\Big(\Big(\frac{|X|}{x}, \frac{X}{|X|}\Big) \in \cdot\Big) \xrightarrow{v} \nu_{\alpha} \times S \qquad (2.17)$$

in the space of positive Radon measures on $(\epsilon, \infty] \times \mathbb{S}^1_+$ for all $\epsilon > 0$, where
$\alpha > 0$ and $\nu_{\alpha}(t, \infty] = t^{-\alpha}$ ($t > 0$) (see e.g. Resnick [737]). $S$ is often
referred to as the spectral measure of $X$.

The above implies in particular that on every ray from (0,0) into the positive
quadrant (the direction of which is governed by $S$), we have a regularly varying
tail with index $-\alpha$. Moreover, the tail of $X_1 + X_2$ is also regularly varying with
the same index.

For this specific dependence structure, the asymptotic behavior of the sum 
can be given explicitly in terms of the spectral measure. 


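Proposition 2.11 Assume that $X = (X_1, X_2)$ is exchangeable and regularly
varying with index $-\alpha < 0$ and spectral measure $S$. Then

$$P(X_1 + X_2 > x) \sim 2\bar B(x)\,
\frac{\int_0^{\pi/2} (\cos\varphi + \sin\varphi)^{\alpha}\, S(d\varphi)}
{\int_0^{\pi/2} (\cos^{\alpha}\varphi + \sin^{\alpha}\varphi)\, S(d\varphi)}.$$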


³ For $\alpha < 1$ (i.e. infinite mean) this also shows that comonotone dependence does not
necessarily provide an upper bound for the tail asymptotics of all possible dependence structures
with fixed marginals! Intuitively, if the marginal distribution tail is heavy enough, then having
two random sources, each of which can cause a large sum through a single summand, outweighs
the effect of summing two large components from one random source.

Proof. Consider in (2.17) the events $|X|/x > t$ for $t = \dfrac{1}{\cos\varphi + \sin\varphi}$ and $t = \dfrac{1}{\cos\varphi}$,
where $\varphi \in [0, \pi/2]$ denotes the angle corresponding to $X/|X|$. We then obtain

$$b^{-1}(x)\, P(X_1 + X_2 > x) \to \int_0^{\pi/2} (\cos\varphi + \sin\varphi)^{\alpha}\, S(d\varphi)$$

and

$$b^{-1}(x)\, P(X_1 > x) \to \int_0^{\pi/2} \cos^{\alpha}\varphi\, S(d\varphi),$$

so that the result follows from $S(d\varphi) = S(d(\pi/2 - \varphi))$.
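To get a feeling for the constant in Proposition 2.11, it can be evaluated for a spectral measure with finitely many atoms. The sketch below is only an illustration: the function name `sum_tail_constant` and the representation of $S$ as a list of `(angle, mass)` pairs are assumptions made here, and the measure is assumed to be symmetric around $\pi/4$ (exchangeability). It uses the standard facts that independent regularly varying components put the spectral mass on the two axes, whereas comonotone components put all mass at the angle $\pi/4$.

```python
import math


def sum_tail_constant(alpha: float, atoms: list) -> float:
    """Return the constant c with P(X1 + X2 > x) ~ c * Bbar(x) in Proposition 2.11.

    atoms is a list of (angle, mass) pairs describing a discrete spectral
    measure on [0, pi/2]; the masses are assumed to sum to one and to be
    symmetric around pi/4.
    """
    numerator = sum(m * (math.cos(phi) + math.sin(phi)) ** alpha for phi, m in atoms)
    denominator = sum(
        m * (math.cos(phi) ** alpha + math.sin(phi) ** alpha) for phi, m in atoms
    )
    return 2.0 * numerator / denominator


if __name__ == "__main__":
    alpha = 1.5
    on_axes = [(0.0, 0.5), (math.pi / 2, 0.5)]   # mass on the axes
    diagonal = [(math.pi / 4, 1.0)]              # all mass on the diagonal
    print(sum_tail_constant(alpha, on_axes))     # close to 2
    print(sum_tail_constant(alpha, diagonal))    # close to 2 ** alpha
```

The two printed values reproduce the independent constant 2 and the comonotone constant $2^{\alpha}$.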


It occurs in a number of situations that the risks $X_i$ are independent, but
they need to be added with some weights that are not independent. Here is a
result the proof of which can be found in Goovaerts et al. [424].


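Proposition 2.12 Assume that $X_1, \ldots, X_n$ are i.i.d. r.v.'s with regularly varying
tail $\bar B(x) \sim L(x)x^{-\alpha}$ for some $\alpha > 0$ and let $\theta_1, \ldots, \theta_n$ be dependent
non-negative r.v.'s, independent of $X_1, \ldots, X_n$. If there exists some $\delta > 0$ s.t.
$E(\theta_k^{\alpha+\delta}) < \infty$ for $1 \le k \le n$, then

$$P\Big(\max_{1 \le m \le n} \sum_{k=1}^{m} \theta_k X_k > x\Big) \sim P\Big(\sum_{k=1}^{n} \theta_k X_k > x\Big) \sim \bar B(x) \sum_{k=1}^{n} E(\theta_k^{\alpha}).$$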
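If either
$$0 < \alpha < 1, \quad \sum_{k=1}^{\infty} E(\theta_k^{\alpha+\delta}) < \infty, \quad \sum_{k=1}^{\infty} E(\theta_k^{\alpha-\delta}) < \infty \quad \text{for some } \delta > 0$$
or
$$\alpha \ge 1, \quad \sum_{k=1}^{\infty} \big(E(\theta_k^{\alpha+\delta})\big)^{1/(\alpha+\delta)} < \infty, \quad \sum_{k=1}^{\infty} \big(E(\theta_k^{\alpha-\delta})\big)^{1/(\alpha+\delta)} < \infty \quad \text{for some } \delta > 0,$$
then

$$P\Big(\max_{1 \le m < \infty} \sum_{k=1}^{m} \theta_k X_k^{+} > x\Big) \sim P\Big(\sum_{k=1}^{\infty} \theta_k X_k^{+} > x\Big) \sim \bar B(x) \sum_{k=1}^{\infty} E(\theta_k^{\alpha}).$$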


Example 2.13 Recall the discrete time risk model with stochastic investment
of Section VIII.5. If we choose $\theta_i = A_1^{-1} \cdots A_i^{-1}$ and $X_i = B_i$, then
Proposition 2.12 applies for the case of regularly varying insurance risk $B_i$ (with
index $-\alpha$). The conditions of the Proposition translate into $E(A_1^{-(\alpha+\delta)}) < \infty$
and $E(A_1^{-(\alpha+\delta)}) < 1$, respectively, for some $\delta > 0$. Under these assumptions,
Proposition 2.12 gives the finite-time ruin probability

$$\psi(u, n) = P(\tau(u) \le n) \sim \bar B(u)\, \frac{E(A_1^{-\alpha})\big(1 - (E(A_1^{-\alpha}))^{n}\big)}{1 - E(A_1^{-\alpha})}, \quad u \to \infty,$$

and the infinite-time ruin probability

$$\psi(u) = P(\tau(u) < \infty) \sim \bar B(u)\, \frac{E(A_1^{-\alpha})}{1 - E(A_1^{-\alpha})}, \quad u \to \infty,$$

which refines Theorem VIII.5.8 for this particular case.
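The constants in Example 2.13 are geometric sums in $q = E(A_1^{-\alpha})$: with i.i.d. $A_i$, the weights satisfy $E(\theta_k^{\alpha}) = q^k$, so $\sum_{k=1}^{n} E(\theta_k^{\alpha}) = q(1-q^n)/(1-q)$. A minimal sketch, with hypothetical function names and with $q$ supplied directly by the caller (an assumption made here, since the distribution of the $A_i$ is not specified), is:

```python
def finite_time_constant(q: float, n: int) -> float:
    """Sum of q**k for k = 1..n, the factor multiplying Bbar(u) in psi(u, n)."""
    if q < 0.0 or n < 0:
        raise ValueError("need q >= 0 and n >= 0")
    if q == 1.0:
        return float(n)
    return q * (1.0 - q ** n) / (1.0 - q)


def infinite_time_constant(q: float) -> float:
    """Limit of finite_time_constant(q, n) as n grows; requires q < 1."""
    if not 0.0 <= q < 1.0:
        raise ValueError("the infinite-horizon constant needs 0 <= q < 1")
    return q / (1.0 - q)


if __name__ == "__main__":
    q = 0.8  # hypothetical value of E(A_1 ** (-alpha))
    for n in (1, 5, 50):
        print(n, finite_time_constant(q, n))
    print("limit", infinite_time_constant(q))
```

The finite-horizon constant increases in $n$ towards the infinite-horizon one, which is only finite when $q < 1$.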


Other subexponential marginal distributions 


From the proof of Proposition 2.9, it becomes clear that Corollary 2.10 also
holds true for any $B \in \mathcal{S}$ with heavier tail than regularly varying. On the
other hand, in general the marginal tails cannot be much lighter than regularly
varying in order to dominate the 'dependence effect' in the tail of the sum given
$\lambda = 0$, as the following result shows.


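Proposition 2.14 If the mean excess function $e(x)$ is self-neglecting, i.e.

$$\lim_{x \to \infty} \frac{e(x + a e(x))}{e(x)} = 1 \quad \forall a > 0, \qquad (2.18)$$
and if
$$\inf_{a > 0} \liminf_{x \to \infty} P(X_2 > a e(x) \mid X_1 > x) > 0, \qquad (2.19)$$
then
$$\liminf_{x \to \infty} \frac{P(X_1 + X_2 > x)}{\bar B(x)} = \infty.$$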


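Proof. From Proposition X.1.18 we know that the self-neglecting property (2.18)
implies

$$\lim_{x \to \infty} \frac{\bar B(x + a e(x))}{\bar B(x)} = e^{-a},$$
and we have
$$\frac{\bar B(x)}{\bar B(x - a e(x))} \sim \frac{\bar B(x + a e(x))}{\bar B(x + a e(x) - a e(x + a e(x)))} \sim \frac{\bar B(x + a e(x))}{\bar B(x)}.$$
Together with (2.19) this gives

$$\begin{aligned}
P(X_1 + X_2 > x) &\ge P(X_1 > x - a e(x),\, X_2 > a e(x)) \\
&= P(X_1 > x - a e(x))\, P(X_2 > a e(x) \mid X_1 > x - a e(x)) \\
&\sim P(X_1 > x - a e(x))\, P(X_2 > a e(x) \mid X_1 > x) \\
&\ge \epsilon\, P(X_1 > x - a e(x)) \\
&\sim \epsilon\, P(X_1 > x)\, e^{a}
\end{aligned}$$

for some $\epsilon > 0$ and any $a > 0$. Hence

$$\liminf_{x \to \infty} \frac{P(X_1 + X_2 > x)}{\bar B(x)} \ge \epsilon e^{a},$$

and the latter is unbounded for $a \to \infty$.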


Remark 2.15 Recall from Chapter X that condition (2.18) is satisfied for 
Weibull and lognormal distributions (more generally, for all subexponential dis- 
tributions which lie in the maximum domain of attraction of the Gumbel distri- 
bution, cf. X.6b). A sufficient condition for (2.19) to hold is 


$$\liminf_{x \to \infty} P(X_2 > e^*(x) \mid X_1 > x) > 0$$


for any e*(x) with e*(x)/e(x) — oo. One can show that for all B that satisfy 
(2.18) there exists a dependence structure such that (2.19) is satisfied (cf. [13]). 


On the other hand, for a particular given dependence structure the tail of 
the sum may well be asymptotically equivalent to the one of the independent 
sum. This is illustrated by the following example with lognormal marginals and 
a Gaussian copula (which is tail independent). 


Proposition 2.16 Let $Y_1, Y_2$ be bivariate normal with the same mean $\mu$, the
same variance $\sigma^2$ and covariance $\rho \in [-1, 1)$. Then, for $X_1 = e^{Y_1}$ and $X_2 = e^{Y_2}$
one has

$$P(X_1 + X_2 > x) \sim 2P(X_1 > x) \sim \frac{\sqrt{2}\,\sigma}{\sqrt{\pi}\,\log x} \exp\{-(\log x - \mu)^2/2\sigma^2\}.$$
Proof. Rather than giving a rigorous technical proof (for which we refer to
Asmussen & Rojas-Nandayapa [96]), we give here just a short heuristic argument
supporting the result. Take $\mu = 0$, $\sigma^2 = 1$, $\rho > 0$ for simplicity. Then we can
write

$$Y_1 = U + V_1, \qquad Y_2 = U + V_2,$$
where $U, V_1, V_2$ are independent univariate Gaussian with mean zero and
variances $a^2, b^2, b^2$, respectively, where $a^2 + b^2 = 1$, $a^2 = \rho$. Given $U = u$, $X_1$ and
$X_2$ are independent lognormals with log-variance $b^2$, so by subexponential limit
theory

$$P(X_1 + X_2 > x \mid U = u) = P(e^{V_1} + e^{V_2} > x e^{-u})
\sim \frac{\sqrt{2}\, b}{\sqrt{\pi}\,(\log x - u)} \exp\{-(\log x - u)^2/2b^2\}.$$

We make the guess

$$P(X_1 + X_2 > x) \approx \max_{u} \frac{1}{a\sqrt{2\pi}}\, e^{-u^2/2a^2}\, P(X_1 + X_2 > x \mid U = u) \qquad (2.20)$$

and ignore everything not in the exponent and constants. Then we have to find
the $u$ minimizing
$$\frac{u^2}{2a^2} - \frac{u \log x}{b^2} + \frac{u^2}{2b^2},$$
which (using $a^2 + b^2 = 1$) is easily seen to be $u = a^2 \log x$. Substituting back in
(2.20), we get

$$\begin{aligned}
P(X_1 + X_2 > x) &\approx \exp\{-a^4 \log^2 x/2a^2 - (1 - a^2)^2 \log^2 x/2b^2\} \\
&= \exp\{-\log^2 x/2\} \qquad (2.21)
\end{aligned}$$

in agreement with the claimed assertion (here $\approx$ is used to indicate asymptotics
at a rough level, i.e. rougher than $\sim$ or even logarithmic asymptotics as used in
large deviations theory). Note that the argument contains some information on
how $X_1 + X_2$ exceeds $x$: $U$ must be approximately $u = a^2 \log x = \rho \log x$ and
either $V_1$ or $V_2$ but not both large. Translated back to $X_1, X_2$, this means that
one is larger than $x$ and the other of order $e^{u} = x^{\rho}$.
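The minimization step of the heuristic can be checked numerically. The sketch below (function names are hypothetical, and $\mu = 0$, $\sigma^2 = 1$ as in the heuristic) evaluates the full exponent $u^2/2a^2 + (\log x - u)^2/2b^2$ with $a^2 = \rho$, $b^2 = 1 - \rho$, and confirms that it is smallest at $u = \rho \log x$, where it equals $\log^2 x/2$.

```python
def exponent(u: float, log_x: float, rho: float) -> float:
    """Negative log of exp(-u^2/2a^2) * exp(-(log x - u)^2/2b^2) with a^2 = rho."""
    a2 = rho
    b2 = 1.0 - rho
    return u * u / (2.0 * a2) + (log_x - u) ** 2 / (2.0 * b2)


def minimizer(log_x: float, rho: float) -> float:
    """Claimed minimizer u = a^2 log x = rho log x."""
    return rho * log_x


if __name__ == "__main__":
    log_x, rho = 5.0, 0.3
    u_star = minimizer(log_x, rho)
    print(u_star, exponent(u_star, log_x, rho), log_x ** 2 / 2.0)
    # Values slightly away from u_star give a larger exponent.
    print(exponent(u_star - 0.1, log_x, rho), exponent(u_star + 0.1, log_x, rho))
```

The dropped term $\log^2 x/2b^2$ does not depend on $u$, so the minimizer is the same as for the three-term expression in the text.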


We finish this section with a fairly general result of Foss & Richards [367]
about conditions under which the tail of the dependent sum asymptotically
behaves as the tail for the independent sum. Note that a consequence of
Proposition X.1.5 is that for $B \in \mathcal{S}$ there always exists a monotone function $h(x) \nearrow \infty$
with

$$\lim_{x \to \infty} \bar B(x - h(x))/\bar B(x) = 1. \qquad (2.22)$$


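Theorem 2.17 Let $B \in \mathcal{S}$. Assume that $X_1, X_2, \ldots$ are positive r.v.'s with
c.d.f. $B_i$ in a probability space $(\Omega, \mathcal{F}, P)$ such that for each $i$, $\bar B_i(x) \sim c_i \bar B(x)$
(with at least one $c_i \ne 0$ and $\exists\, c > 0$ and $x_0 > 0$ s.t. $\bar B_i(x) \le c\bar B(x)$ for all
$x > x_0$). Further
(i) $X_1, X_2, \ldots$ are conditionally independent given a $\sigma$-algebra $\mathcal{G} \subset \mathcal{F}$,

(ii) for each $i$ there exists a non-decreasing function $r(x)$ and an increasing
collection of sets $J_i(x) \in \mathcal{G}$ with $J_i(x) \to \Omega$ as $x \to \infty$ such that

$$P(X_i > x \mid \mathcal{G})\, I(J_i(x)) \le r(x)\bar B(x)\, I(J_i(x)) \quad \text{a.s.}$$

and such that for a function $h(x)$ that satisfies (2.22), uniformly in $i$,

1. $P(\overline{J_i}(h(x))) = o(\bar B(x))$, where $\overline{J_i}$ denotes the complement of $J_i$,
2. $r(x)\bar B(h(x)) = o(1)$,
3. $r(x) \int_{h(x)}^{x - h(x)} \bar B(x - y)\, B(dy) = o(\bar B(x))$.

Then for all $n \in \mathbb{N}$, $P(X_1 + \cdots + X_n > x) \sim (c_1 + \cdots + c_n)\bar B(x)$.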
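Proof. Consider first $X_1 + X_2$. We have the inequalities

$$P(X_1 + X_2 > x) \le P(X_1 > x - h(x)) + P(X_2 > x - h(x)) + P(h(x) < X_1 \le x - h(x),\, X_2 > x - X_1)$$

and
$$P(X_1 + X_2 > x) \ge P(X_1 > x) + P(X_2 > x) - P(X_1 > x, X_2 > x).$$
Now, if $Y$ is another r.v. with c.d.f. $B$, independent of $X_1, X_2$,

$$\begin{aligned}
P(h(x) < X_1 \le x - h(x),\, X_2 > x - X_1)
&= E\big[P(h(x) < X_1 \le x - h(x),\, X_2 > x - X_1 \mid \mathcal{G})\big] \\
&= E\Big(\int_{h(x)}^{x - h(x)} P(X_1 \in dy \mid \mathcal{G})\, P(X_2 > x - y \mid \mathcal{G})\, \big[I(J_2(x - y)) + I(\overline{J_2}(x - y))\big]\Big) \\
&\le r(x)\, E\Big(\int_{h(x)}^{x - h(x)} P(X_1 \in dy \mid \mathcal{G})\, P(Y > x - y)\Big) + E\big(I(\overline{J_2}(h(x)))\big) \\
&= r(x) \int_{h(x)}^{x - h(x)} P(X_1 \in dy)\, \bar B(x - y) + o(\bar B(x)) = o(\bar B(x)).
\end{aligned}$$
At the same time

$$\begin{aligned}
P(X_1 > x, X_2 > x)
&= E\big(P(X_1 > x, X_2 > x \mid \mathcal{G})\, \big(I(J_2(x)) + I(\overline{J_2}(x))\big)\big) \\
&\le E\big(P(X_1 > x \mid \mathcal{G})\, P(X_2 > x \mid \mathcal{G})\, I(J_2(x))\big) + E\, I(\overline{J_2}(x)) \\
&\le r(x)\bar B(x)\, P(X_1 > x) + o(\bar B(x)) = o(\bar B(x)).
\end{aligned}$$

Consequently, $P(X_1 + X_2 > x) \sim P(X_1 > x) + P(X_2 > x)$. Since w.l.o.g. $c_1 > 0$,

$$P(X_1 + X_2 > x) \sim (c_1 + c_2)\bar B(x).$$

The result for general $n$ now follows by induction.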


The following extension of Lemma X.1.8 is proved in [367]. 


Lemma 2.18 Under the conditions of Theorem 2.17, for any $\epsilon > 0$ there exist
$V(\epsilon) > 0$ and $x_0 = x_0(\epsilon)$ such that for any $x > x_0$ and $n \ge 1$

$$P(X_1 + \cdots + X_n > x) \le V(\epsilon)(1 + \epsilon)^{n}\bar B(x).$$

This result and Theorem 2.17 together with dominated convergence now give
the following extension of Lemma X.2.2.

Proposition 2.19 Let $K$ be an independent integer-valued r.v. with $E z^{K} < \infty$
for some $z > 1$. Under the assumptions of Theorem 2.17 one then has
$$P(X_1 + \cdots + X_K > x) \sim E\Big(\sum_{i=1}^{K} c_i\Big)\bar B(x).$$


In several applications in risk theory, conditionally independent r.v.'s will be an
appropriate description of the dependence structure. However, the challenge in
the application of the above result is to identify a $\sigma$-algebra $\mathcal{G}$ and a
corresponding function $h(x)$ that satisfy the assumptions of Theorem 2.17. See [367] for
some worked out examples.


Notes and references Some general non-asymptotic bounds on P(X1+---+Xn > 
x) are derived in Denuit, Genest & Marceau [293], Cossette, Denuit & Marceau [257], 
Mesfioui & Quessy [635] and Embrechts & Puccetti [350] (see also [351] for bounds on 
functions of multivariate risks). Worst-case scenarios are also studied by Rüschendorf
[759]. Parts of the material in this section are from Albrecher, Asmussen & Kortschak
[13]. It is of course also possible to represent results on the asymptotic behavior
of the sum through conditions on the underlying copula. Quite explicit results for
Archimedean copulas can be found in Alink, Loewe & Wuethrich [42], see also [43];
for extensions using multivariate extreme value theory see Barbe, Fougères & Genest
[130] and, with an emphasis on non-identically distributed marginals, Kortschak & 
Albrecher [556]. For further results on asymptotically independent subexponential 
risks in the maximum domain of attraction of the Gumbel distribution, see Mitra & 
Resnick [647] and also Laeven, Goovaerts & Hoedemakers [571] with a view towards 
actuarial applications. 

Tang & Wang [832] extend Proposition 2.12 to random variables with dominated 
variation. Asymptotic tail probabilities for negatively associated sums of heavy-tailed 
random variables are investigated in Wang & Tang [871] and Geluk & Ng [392]. 


3 Principles for premium calculation 


The standard setting for discussing premium calculation in the actuarial
literature does not involve stochastic processes, but only a single risk $X \ge 0$. By this
we mean that $X$ is a r.v. representing the random payment to be made (possibly
0). A premium rule is then a $[0, \infty)$-valued function $H$ of the distribution of $X$,
often written $H(X)$, such that $H(X)$ is the premium to be paid, i.e. the amount
for which the company is willing to insure the given risk.

Among the standard premium rules discussed in the literature (not neces- 
sarily the same which are used in practice!) are the following: 


The net premium principle H(X) = EX (also called the equivalence prin- 
ciple). As follows from the fluctuation theory of r.v.’s with finite mean, 
this principle will lead to ruin if many independent risks are insured. This 
motivates the next principle, 


The expected value principle $H(X) = (1 + \eta)EX$ where $\eta$ is a specified
safety loading. For $\eta = 0$, we are back to the net premium principle.
A criticism of the expected value principle is that it does not take into
account the variability of $X$. This leads to


The variance principle $H(X) = EX + \eta\,\mathrm{Var}(X)$. A modification (motivated
from $EX$ and $\mathrm{Var}(X)$ not having the same dimension) is


The standard deviation principle $H(X) = EX + \eta\sqrt{\mathrm{Var}(X)}$.


The principle of zero utility. Here $v(x)$ is a given utility function, assumed
to be concave and increasing with (w.l.o.g.) $v(0) = 0$; $v(x)$ represents the
utility of a capital of size $x$. The zero utility principle then means
$v(0) = E v(H(X) - X)$ or, taking into account the initial reserve $u$ in the portfolio,

$$v(u) = E v(u + H(X) - X). \qquad (3.1)$$
By Jensen's inequality, $v(u + H(X) - EX) \ge E v(u + H(X) - X) = v(u)$ so
that $H(X) \ge EX$. For $v(x) = x$, we have equality and are back to the net
premium principle. There is also an approximate argument leading to the
variance principle as follows. Assuming that the Taylor approximation

$$v(u + H(X) - X) \approx v(u) + v'(u)(H(X) - X) + \frac{v''(u)}{2}(H(X) - X)^2$$

is reasonable, taking expectations leads to the quadratic equation

$$v'' H(X)^2 + H(X)(2v' - 2v'' EX) + v'' EX^2 - 2v' EX = 0$$

(with $v', v''$ evaluated at $u$) with solution

$$H(X) = EX + \frac{-v' \pm \sqrt{v'^2 - v''^2\,\mathrm{Var}(X)}}{v''}, \qquad
\sqrt{v'^2 - v''^2\,\mathrm{Var}(X)} = v'\Big(1 - \frac{v''^2}{2v'^2}\mathrm{Var}(X) + O\big((v''/v')^4\big)\Big).$$

If $v''/v'$ is small, we can ignore the last term. Taking $+\sqrt{\cdot}$ then yields

$$H(X) \approx EX - \frac{v''(u)}{2v'(u)}\,\mathrm{Var}\,X;$$

since $v''(u) < 0$ by concavity, this is approximately the variance principle.


The most important special case of the principle of zero utility is 


The exponential principle which corresponds to $v(x) = (1 - e^{-ax})/a$ for
some $a > 0$. Here the initial capital $u$ cancels out and (3.1) leads to

$$H(X) = \frac{1}{a} \log E e^{aX}.$$

Since m.g.f.'s are log-convex, it follows that $H_a(X) = H(X)$ is increasing
as function of $a$. Further, $\lim_{a \downarrow 0} H_a(X) = EX$ (the net premium principle)
and, provided $b = \operatorname{ess\,sup} X < \infty$, $\lim_{a \to \infty} H_a(X) = b$ (the premium
principle $H(X) = b$ is called the maximal loss principle but is clearly not
very realistic). In view of this, $a$ is called the risk aversion.

Note that in the compound Poisson model, the premium collected for
the aggregate risk $A_t$ is $pt$. Equating this with $H(A_t) = \frac{1}{a} \log E e^{aA_t}$ leads
to the Lundberg equation for $a$. Hence, the premium principle in the
Cramér-Lundberg model can be interpreted as an exponential principle
with risk aversion $\gamma$, given the adjustment coefficient $\gamma > 0$ exists.

$$H(X) = \int_0^{\infty} g(P(X > x))\, dx$$

for a fixed nondecreasing and left-continuous function $g : [0, 1] \to [0, 1]$
(also called the distortion function) such that $g(0) = 0$ and $g(1) = 1$.


The percentile principle Here one chooses a (small) number $\alpha$, say 0.05 or
0.01, and determines $H(X)$ by $P(X \le H(X)) = 1 - \alpha$ (assuming a
continuous distribution for simplicity).
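The premium rules listed above are straightforward to evaluate for a risk with finitely many values. The following Python sketch is an illustration under assumptions made here: the risk is given as parallel lists `values` and `probs`, all function names are hypothetical, and for the percentile principle the smallest value $h$ with $P(X \le h) \ge 1 - \alpha$ is used, since the distribution is discrete rather than continuous.

```python
import math


def mean(values, probs):
    return sum(p * x for x, p in zip(values, probs))


def variance(values, probs):
    m = mean(values, probs)
    return sum(p * (x - m) ** 2 for x, p in zip(values, probs))


def expected_value_premium(values, probs, eta):
    """(1 + eta) EX; eta = 0 gives the net premium."""
    return (1.0 + eta) * mean(values, probs)


def variance_premium(values, probs, eta):
    return mean(values, probs) + eta * variance(values, probs)


def std_dev_premium(values, probs, eta):
    return mean(values, probs) + eta * math.sqrt(variance(values, probs))


def exponential_premium(values, probs, a):
    """(1/a) log E exp(aX), computed with a shift to avoid overflow."""
    shift = max(a * x for x in values)
    log_mgf = shift + math.log(
        sum(p * math.exp(a * x - shift) for x, p in zip(values, probs))
    )
    return log_mgf / a


def percentile_premium(values, probs, alpha):
    """Smallest value h with P(X <= h) >= 1 - alpha (discrete convention)."""
    cumulative = 0.0
    for x, p in sorted(zip(values, probs)):
        cumulative += p
        if cumulative >= 1.0 - alpha - 1e-12:
            return x
    return max(values)


if __name__ == "__main__":
    values, probs = [0.0, 10.0], [0.5, 0.5]
    print(expected_value_premium(values, probs, 0.2))
    print(variance_premium(values, probs, 0.1))
    print(exponential_premium(values, probs, 0.1))
```

For the two-point risk in the usage example, the exponential premium lies strictly between the mean 5 and the maximal loss 10 and grows with the risk aversion $a$.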


Some standard criteria for evaluating the merits of premium rules are 


1. $\eta \ge 0$, i.e. $H(X) \ge EX$.

2. $H(X) \le b$ when $b$ (the ess sup above) is finite

3. $H(X + c) = H(X) + c$ for any constant $c$

4. $H(X + Y) = H(X) + H(Y)$ when $X, Y$ are independent

5. $H(X) = H(H(X \mid Y))$. For example, if $X = \sum_{i=1}^{N} U_i$ is a random sum with
the $U_i$ independent of $N$, this yields

$$H\Big(\sum_{i=1}^{N} U_i\Big) = H(H(U)N)$$

(where, of course, $H(U)$ is a constant).


Note that $H(cX) = cH(X)$ is not on the list! Considering the examples above,
the net premium principle and the exponential principle can be seen to be the
only ones satisfying all five properties. The expected value principle fails to
satisfy, e.g., 3). The variance principle satisfies 4), since variances of
independent risks add, but violates, e.g., 2), whereas (at least) 4) is violated for the
standard deviation principle and the zero utility principle (unless it is the
exponential or net premium principle). For more detail, see e.g. Gerber [398] or
Sundt [820].


Notes and references The discussed premium principles are standard and can 
be found in many texts on insurance mathematics, e.g. Gerber [398], Heilmann [458] 
and Sundt [820]. For an extensive treatment, see Goovaerts et al. [423]. In recent 
years, the discussion about which criteria H should or should not fulfill in various 
applications has experienced enormous interest and activity in related finance contexts 
under the terminology of risk measures, see for instance Pflug & Römisch [697] for
an overview. On the insurance side, going from the static pricing framework above
towards a dynamic one is considered to be an important step for many situations.
Time consistency and market consistency play a crucial role in this context; for some
recent developments see e.g. Cheridito, Delbaen & Kupper [238], Jobert & Rogers
[507], Malamud, Trubowitz & Wüthrich [624] and Pelsser [690].


4 Reinsurance 


Reinsurance means that the company (the cedent) insures a part of the risk at 
another insurance company (the reinsurer). 

Again, we start by formulating the basic concepts within the framework of
a single risk $X \ge 0$. A reinsurance arrangement is then defined in terms of a
function $h(x)$ with the property $0 \le h(x) \le x$. Here $h(x)$ is the amount of the
claim $x$ to be paid by the reinsurer and $x - h(x)$ the amount to be paid by the
cedent. The function $x - h(x)$ is referred to as the retention function. The most
common examples are the following two:


Proportional reinsurance $h(x) = \theta x$ for some $\theta \in (0, 1)$. Also called quota
share reinsurance.


Stop-loss reinsurance $h(x) = (x - b)^{+}$ for some $b \in (0, \infty)$, referred to as the
retention limit. Note that the retention function is $x \wedge b$.


Concerning terminology, note that in the actuarial literature the stop-loss
transform of $F(x) = P(X \le x)$ (or, equivalently, of $X$), is defined as the function

$$b \mapsto E(X - b)^{+} = \int_b^{\infty} (x - b)\, F(dx) = \int_b^{\infty} \bar F(x)\, dx.$$


An arrangement closely related to stop-loss reinsurance is excess-of-loss reinsur- 
ance, see below. 
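As a small numerical illustration of the stop-loss transform, the sketch below integrates the tail $\bar F$ with the trapezoidal rule and compares the result with the closed form $e^{-\lambda b}/\lambda$ for an exponential risk with rate $\lambda$. The exponential example, the truncation point `upper` and all function names are assumptions chosen here for illustration.

```python
import math


def reinsurer_share(x: float, b: float) -> float:
    """Stop-loss payment of the reinsurer, (x - b)^+."""
    return max(x - b, 0.0)


def cedent_share(x: float, b: float) -> float:
    """Retained part of the claim, min(x, b)."""
    return min(x, b)


def stop_loss_exponential(b: float, rate: float) -> float:
    """Closed form of E(X - b)^+ for an exponential risk with the given rate."""
    return math.exp(-rate * b) / rate


def stop_loss_numeric(tail, b: float, upper: float, steps: int = 100000) -> float:
    """Trapezoidal approximation of the integral of tail(x) over [b, upper]."""
    h = (upper - b) / steps
    total = 0.5 * (tail(b) + tail(upper))
    for k in range(1, steps):
        total += tail(b + k * h)
    return total * h


if __name__ == "__main__":
    b, rate = 1.0, 1.0
    numeric = stop_loss_numeric(lambda x: math.exp(-rate * x), b, b + 40.0)
    print(numeric, stop_loss_exponential(b, rate))
```

The two shares always add up to the full claim, which is the identity $x = (x - b)^{+} + x \wedge b$ behind the remark on the retention function.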

Stop-loss reinsurance and excess-of-loss reinsurance have a number of nice 
optimality properties. The first we prove is in terms of maximal utility: 


Proposition 4.1 Let $X$ be a given risk, $v$ a given concave non-decreasing utility
function and $h$ a given retention function. Let further $b$ be determined by
$E(X - b)^{+} = E h(X)$. Then for any $x$,

$$E v(x - [X - h(X)]) \le E v(x - X \wedge b).$$


Remark 4.2 Proposition 4.1 can be interpreted as follows. Assume that the
cedent charges a premium $P > EX$ for the risk $X$ and is willing to pay $P_1 < P$
for reinsurance. If the reinsurer applies the expected value principle with safety
loading $\eta$, this implies that the cedent is looking for retention functions with
$E h(X) = P_2 = P_1/(1 + \eta)$. The expected utility after settling the risk is thus

$$E v(u + P - P_1 - [X - h(X)])$$

where $u$ is the initial reserve. Letting $x = u + P - P_1$, Proposition 4.1 shows
that the stop-loss rule $h(X) = (X - b)^{+}$ with $b$ chosen such that $E(X - b)^{+}
= P_2$ maximizes the expected utility.


Recall the notions of stochastic ordering from Section IV.8. For the proof of 
Proposition 4.1, we shall need the following lemma: 


Lemma 4.3 (OHLIN'S LEMMA) Let $X_1, X_2$ be two risks with the same mean,
such that
$$F_1(x) \le F_2(x), \quad x < b, \qquad F_1(x) \ge F_2(x), \quad x \ge b$$

for some $b$ where $F_i(x) = P(X_i \le x)$. Then $X_1 \prec_{cx} X_2$.


Proof. Define $\Delta(u) = E(X_2 - u)^{+} - E(X_1 - u)^{+}$. Clearly $\Delta(0) = 0$ and
$\lim_{u \to \infty} \Delta(u) = 0$. But from the representation $\Delta(u) = \int_u^{\infty} (F_1(x) - F_2(x))\, dx$,
we have under the given assumptions that $\Delta(u)$ increases on $(0, b)$ and decreases
on $(b, \infty)$. So $\Delta(u) \ge 0$ for all $u \ge 0$, i.e. $X_1 \prec_{icx} X_2$. Since $E[X_1] = E[X_2]$,
this implies $X_1 \prec_{cx} X_2$.


Proof of Proposition 4.1. It is easily seen that the assumptions of Ohlin's lemma
hold when $X_1 = X \wedge b$, $X_2 = X - h(X)$; in particular, the requirement $EX_1
= EX_2$ is then equivalent to $E(X - b)^{+} = E h(X)$. Now just note that $-v$ is
convex.
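Proposition 4.1 can be illustrated on a small discrete risk. The sketch below is only an example under assumptions made here: the risk is uniform on $\{0, 1, 2, 3, 4\}$, the competing arrangement is proportional with $h(x) = 0.5x$, the utility is $v(y) = 1 - e^{-y}$, and the function names are hypothetical. The retention limit $b$ with $E(X - b)^{+} = E h(X)$ is found by bisection.

```python
import math


def expected(values, probs, g):
    """E g(X) for a discrete risk."""
    return sum(p * g(x) for x, p in zip(values, probs))


def stop_loss_level(values, probs, target, tol=1e-12):
    """Bisection for b with E(X - b)^+ = target; the map b -> E(X - b)^+ is nonincreasing."""
    lo, hi = 0.0, max(values)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected(values, probs, lambda x: max(x - mid, 0.0)) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def utility(y):
    """A concave increasing utility with v(0) = 0 (illustrative choice)."""
    return 1.0 - math.exp(-y)


if __name__ == "__main__":
    values = [0.0, 1.0, 2.0, 3.0, 4.0]
    probs = [0.2] * 5
    reinsured_mean = expected(values, probs, lambda x: 0.5 * x)
    b = stop_loss_level(values, probs, reinsured_mean)
    x0 = 0.0  # capital after premiums, the x of Proposition 4.1
    u_prop = expected(values, probs, lambda x: utility(x0 - (x - 0.5 * x)))
    u_stop = expected(values, probs, lambda x: utility(x0 - min(x, b)))
    print(b, u_prop, u_stop)
```

In this example the stop-loss arrangement with the same expected reinsured amount gives the larger expected utility, as the proposition asserts.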


We now turn to the case where the risk can be written as
$$X = \sum_{i=1}^{N} U_i \qquad (4.1)$$

with the $U_i$ independent; $N$ may be random but should then be independent
of the $U_i$. Typically, $N$ could be the number of claims in a given period, say a
year, and the $U_i$ the corresponding claim sizes. A reinsurance arrangement of
the form $h(X)$ as above is called global; if instead $h$ is applied to the individual
claims so that the reinsurer pays the amount $\sum_{i=1}^{N} h(U_i)$, the arrangement is
called local.⁴


⁴ More generally, one could consider $\sum_{i=1}^{N} h_i(U_i)$.

The following discussion will focus on maximizing the adjustment coefficient.
For a global rule with retention function $h^*(x)$ and a given premium $P^*$ charged
for $X - h^*(X)$, the cedent's adjustment coefficient $\gamma^*$ is determined by

$$1 = E \exp\{\gamma^*[X - h^*(X) - P^*]\}; \qquad (4.2)$$

for a local rule corresponding to $h(u)$ and premium $P$ for $X - \sum_{i=1}^{N} h(U_i)$, we
look instead for the $\gamma$ solving

$$1 = E \exp\Big\{\gamma\Big[\sum_{i=1}^{N} [U_i - h(U_i)] - P\Big]\Big\} = E \exp\Big\{\gamma\Big[X - P - \sum_{i=1}^{N} h(U_i)\Big]\Big\}. \qquad (4.3)$$

This definition of the adjustment coefficients is motivated by considering ruin 
at a sequence of equally spaced time points, say consecutive years, such that N 
is the generic number of claims in a year and P, P* the total premiums charged 
in a year, and referring to the results of VI.3a. The following result shows that 
if we compare only arrangements with P = P*, a global rule is preferable to a 
local one. 


Proposition 4.4 To any local rule with retention function $h(u)$ and any

$$P > E\Big[X - \sum_{i=1}^{N} h(U_i)\Big] \qquad (4.4)$$

there is a global rule with retention function $h^*(x)$ such that

$$E h^*(X) = E \sum_{i=1}^{N} h(U_i) \qquad (4.5)$$

and $\gamma^* \ge \gamma$.
Proof. Define

$$h^*(x) = E\Big[\sum_{i=1}^{N} h(U_i) \,\Big|\, X = x\Big];$$

then (4.5) holds trivially. Applying the inequality $E\varphi(Y) \ge E\varphi(E(Y \mid X))$ (with
$\varphi$ convex) to $\varphi(y) = e^{\gamma y}$, $Y = \sum_{i=1}^{N} [U_i - h(U_i)] - P$, we get

$$1 = E \exp\Big\{\gamma\Big[\sum_{i=1}^{N} [U_i - h(U_i)] - P\Big]\Big\} \ge E \exp\{\gamma[X - h^*(X) - P]\}.$$

But since $\gamma > 0$, $\gamma^* > 0$ because of (4.4), this implies $\gamma^* \ge \gamma$.

Remark 4.5 Because of the independence assumptions, expectations like those
in (4.3), (4.4), (4.5) simplify a lot. Assuming for simplicity that the $U_i$ are i.i.d.,
we get $EX = EN \cdot EU$,

$$E\Big[X - \sum_{i=1}^{N} h(U_i)\Big] = EN \cdot E[U - h(U)],$$

$$E \exp\Big\{\gamma\Big[\sum_{i=1}^{N} [U_i - h(U_i)] - P\Big]\Big\} = e^{-\gamma P}\, E\, \hat C[\gamma]^{N}, \qquad (4.6)$$
where $\hat C[\gamma] = E e^{\gamma(U - h(U))}$ and so on.


The arrangement used in practice is, however, as often local as global. Local
reinsurance with $h(u) = (u - b)^{+}$ is referred to as excess-of-loss reinsurance and
plays a particular role:


Proposition 4.6 Assume the $U_i$ are i.i.d. Then for any local retention function
$u - h(u)$ and any $P$ satisfying (4.4), the excess-of-loss rule $h_1(u) = (u - b)^{+}$
with $b$ determined by

$$E(U - b)^{+} = E h(U) \qquad (4.7)$$
(and the same $P$) satisfies $\gamma_1 \ge \gamma$.


Proof. As in the proof of Proposition 4.4, it suffices to show that

$$E \exp\Big\{\gamma\Big[\sum_{i=1}^{N} U_i \wedge b - P\Big]\Big\} \le 1 = E \exp\Big\{\gamma\Big[\sum_{i=1}^{N} [U_i - h(U_i)] - P\Big]\Big\},$$

or, appealing to (4.6), that $\hat C_1[\gamma] \le \hat C[\gamma]$ where $\hat C_1[\gamma] = E e^{\gamma(U \wedge b)}$. This follows
by taking $X_1 = U \wedge b$, $X_2 = U - h(U)$ (as in the proof of Proposition 4.1) and
$g(x) = e^{\gamma x}$ in Ohlin's lemma.
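A numerical comparison in the spirit of Proposition 4.6 can be set up as follows. The sketch assumes, purely for illustration, that $N$ is Poisson with mean `lam`, so that $E\,\hat C[\gamma]^{N} = \exp\{\lambda(\hat C[\gamma] - 1)\}$ and the equation $1 = e^{-\gamma P} E\,\hat C[\gamma]^{N}$ from (4.6) becomes $\lambda(\hat C[\gamma] - 1) = \gamma P$; the claim size distribution, the premium and the function names are hypothetical choices made here.

```python
import math


def mgf(values, probs, s):
    """E exp(s * V) for a discrete retained claim V."""
    return sum(p * math.exp(s * x) for x, p in zip(values, probs))


def adjustment_coefficient(retained, probs, lam, premium, hi=10.0, tol=1e-12):
    """Positive root of lam * (C[s] - 1) - s * premium, found by bisection."""
    def g(s):
        return lam * (mgf(retained, probs, s) - 1.0) - s * premium

    lo = 1e-6
    if not (g(lo) < 0.0 < g(hi)):
        raise ValueError("no sign change on the search interval")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    claims = [1.0, 2.0, 5.0]
    probs = [1.0 / 3.0] * 3
    lam, premium = 1.0, 2.0
    # Proportional rule h(u) = 0.5 u and excess-of-loss rule with b = 1.5;
    # both have expected reinsured amount 4/3 per claim, as required by (4.7).
    retained_prop = [0.5 * u for u in claims]
    retained_xl = [min(u, 1.5) for u in claims]
    print(adjustment_coefficient(retained_prop, probs, lam, premium))
    print(adjustment_coefficient(retained_xl, probs, lam, premium))
```

With these numbers the excess-of-loss rule yields the larger adjustment coefficient, in line with $\gamma_1 \ge \gamma$.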


Notes and references Reinsurance is a classical topic. The material presented 
here is standard and can be found in many texts on insurance mathematics, e.g. Bowers 
et al. [195], Heilmann [458] and Sundt [820]. See further Hesselager [461] and Dickson 
& Waters [319]. The original reference for Ohlin’s lemma is Ohlin [671]. 

An early reference for minimization of the ruin probability through reinsurance 
in an asymptotic sense by maximizing the adjustment coefficient is Waters [876], see 
Hald & Schmidli [446], Centeno [224, 225] and Guerra & Centeno [441] for more recent 
extensions. The identification of optimal reinsurance strategies under various objective 
functions and constraints is an active field of research, see e.g. Centeno & Simões [226] 
and Albrecher & Teugels [38] for a recent overview. 

For optimal dynamic reinsurance in discrete time, see e.g. Dickson & Waters [321]. 
Optimal adaptive reinsurance strategies in continuous time are discussed in Chapter 
XIV. 


Appendix 


A1 Renewal theory 


1a Renewal processes and the renewal theorem


By a simple point process on the line we understand a random collection of
time epochs without accumulation points and without multiple points. The
mathematical representation is either the ordered set $0 \le T_0 < T_1 < \ldots$ of
epochs or the set $Y_1, Y_2, \ldots$ of interarrival times and the time $Y_0 = T_0$ of the
first arrival (that is, $Y_n = T_n - T_{n-1}$). The point process is called a renewal
process if $Y_0, Y_1, \ldots$ are independent and $Y_1, Y_2, \ldots$ all have the same distribution,
denoted by $F$ in the following and referred to as the interarrival distribution;
the distribution of $Y_0$ is called the delay distribution. If $Y_0 = 0$, the renewal
process is called zero-delayed. The number $\max\{k : T_{k-1} \le t\}$ of renewals in $[0, t]$
is denoted by $N_t$.

The associated renewal measure $U$ is defined by $U = \sum_{n=0}^{\infty} F^{*n}$ where $F^{*n}$
is the $n$th convolution power of $F$. That is, $U(A)$ is the expected number of
renewals in $A \subset \mathbb{R}$ in a zero-delayed renewal process; note in particular that
$U(\{0\}) = 1$.

The renewal theorem asserts that $U(dt)$ is close to $dt/\mu$, Lebesgue measure
$dt$ normalized by the mean $\mu$ of $F$, when $t$ is large. Technically, some condition
is needed: that $F$ is non-lattice, i.e. not concentrated on $\{h, 2h, \ldots\}$ for any
$h > 0$. Then Blackwell's renewal theorem holds, stating that

$$U(t + a) - U(t) \to \frac{a}{\mu}, \quad t \to \infty \qquad (A.1)$$

(here $U(t) = U((0, t])$ so that $U(t + a) - U(t)$ is the expected number of renewals
in $(t, t + a]$). If $F$ satisfies the stronger condition of being spread-out ($F^{*n}$ is
non-singular w.r.t. Lebesgue measure for some $n \ge 1$), then Stone's decomposition
holds: $U = U_1 + U_2$ where $U_1$ is a finite measure and $U_2(dt) = u(t)\, dt$ where
$u(t)$ has limit $1/\mu$ as $t \to \infty$. Note in particular that $F$ is spread-out if $F$ has a
density $f$.

A weaker (and much easier to prove) statement than Blackwell's renewal
theorem is the elementary renewal theorem, stating that $U(t)/t \to 1/\mu$. Both
results are valid for delayed renewal processes, the statements being

$$E N(t + a) - E N(t) \to \frac{a}{\mu}, \quad \text{resp.} \quad \frac{E N_t}{t} \to \frac{1}{\mu}.$$


1b Renewal equations and the key renewal theorem 


The renewal equation is the convolution equation
$$Z(u) = z(u) + \int_0^{u} Z(u - x)\, F(dx), \qquad (A.2)$$

where $Z(u)$ is an unknown function of $u \in [0, \infty)$, $z(u)$ a known function, and
$F(dx)$ a known probability measure. Equivalently, in convolution notation
$Z = z + F * Z$. Under weak regularity conditions (see [APQ, Ch. IV]), (A.2) has the
unique solution $Z = U * z$, i.e.

$$Z(u) = \int_0^{u} z(u - x)\, U(dx). \qquad (A.3)$$

Further, the asymptotic behavior of $Z(u)$ is given by the key renewal theorem:
Proposition A1.1 If $F$ is non-lattice and $z(u)$ is directly Riemann integrable
(d.R.i.; see [APQ, Ch. IV]), then
$$Z(u) \to \frac{1}{\mu} \int_0^{\infty} z(x)\, dx, \quad u \to \infty. \qquad (A.4)$$
If $F$ is spread-out, then it suffices for (A.4) that $z$ is Lebesgue integrable with
$\lim_{x \to \infty} z(x) = 0$.
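The key renewal theorem can be observed numerically by discretizing the renewal equation (A.2). The sketch below is a crude first-order scheme under assumptions made here: $F$ is the Erlang(2) distribution with unit rate (mean $\mu = 2$), $z(u) = e^{-u}$, the mass of $F$ on $((j-1)h, jh]$ is placed at $jh$, and the function names are hypothetical. The limit in (A.4) is then $\frac{1}{2}\int_0^{\infty} e^{-x}\, dx = 1/2$.

```python
import math


def erlang2_cdf(x: float) -> float:
    """C.d.f. of the Erlang(2) distribution with unit rate (mean 2)."""
    return 1.0 - math.exp(-x) * (1.0 + x) if x > 0.0 else 0.0


def solve_renewal_equation(z, cdf, horizon: float, steps: int) -> list:
    """Approximate Z on the grid k * h, h = horizon / steps, from Z = z + F * Z."""
    h = horizon / steps
    # mass[j - 1] approximates F((j - 1) h, j h], placed at the point j h
    mass = [cdf(j * h) - cdf((j - 1) * h) for j in range(1, steps + 1)]
    Z = [z(0.0)]
    for k in range(1, steps + 1):
        convolution = sum(Z[k - j] * mass[j - 1] for j in range(1, k + 1))
        Z.append(z(k * h) + convolution)
    return Z


if __name__ == "__main__":
    Z = solve_renewal_equation(lambda u: math.exp(-u), erlang2_cdf, 20.0, 1000)
    print(Z[-1])  # close to the limit 1/2, up to discretization error
```

The scheme is only first-order accurate in the step size, so the computed value agrees with 1/2 up to an error of the order of the step.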


In IV.9, we shall need the following less standard parallel to the key renewal 
theorem: 


Proposition A1.2 Assume that $Z$ solves the renewal equation (A.2), that $z(u)$
has a limit $z(\infty)$ (say) as $u \to \infty$, and that $F$ has a bounded density.¹ Then

$$\frac{Z(u)}{u} \to \frac{z(\infty)}{\mu_F}, \quad u \to \infty. \qquad (A.5)$$


¹ This condition can be weakened considerably, but suffices for the present purposes.

Proof. The condition on $F$ implies that $U(dx)$ has a bounded density $u(x)$ with
limit $1/\mu_F$ as $x \to \infty$. Hence by dominated convergence,

$$\frac{Z(u)}{u} = \frac{1}{u} \int_0^{u} z(u - x)\, u(x)\, dx = \int_0^{1} z(u(1 - t))\, u(ut)\, dt
\to \int_0^{1} z(\infty)\, \frac{1}{\mu_F}\, dt = \frac{z(\infty)}{\mu_F}.$$


In risk theory, a basic reason that renewal theory is relevant is the renewal
equation III.(3.2) satisfied by the ruin probability for the compound Poisson
model. Here the relevant $F$ does not have mass one ($F$ is defective). However,
asymptotic properties can easily be obtained from the key renewal equation by
an exponential transformation also when $F(dx)$ does not integrate to one. To
this end, multiply (A.2) by $e^{\gamma u}$ to obtain $\tilde Z = \tilde z + \tilde F * \tilde Z$ where $\tilde Z(x) = e^{\gamma x} Z(x)$,
$\tilde z(x) = e^{\gamma x} z(x)$, $\tilde F(dx) = e^{\gamma x} F(dx)$. Assuming that $\gamma$ can be chosen such that
$\int_0^{\infty} e^{\gamma x} F(dx) = 1$, i.e. that $\tilde F$ is a probability measure, results from the case
$\int_0^{\infty} F(dx) = 1$ can then be used to study $\tilde Z$ and thereby $Z$. This program has
been carried out in IV.5a. Note, however, that the existence of $\gamma$ may fail for
heavy-tailed $F$.


1c Regenerative processes


Let $\{T_n\}$ be a renewal process. A stochastic process $\{X_t\}_{t \ge 0}$ with a general
state space $E$ is called regenerative w.r.t. $\{T_n\}$ if for any $k$, the post-$T_k$ process
$\{X_{T_k + t}\}_{t \ge 0}$ is independent of $T_0, T_1, \ldots, T_k$ (or, equivalently, of $Y_0, Y_1, \ldots, Y_k$),
and its distribution does not depend on $k$. The distribution $F$ of $Y_1, Y_2, \ldots$ is
called the cycle length distribution and as before, we let $\mu$ denote its mean. We
let $P_0$, $E_0$ etc. refer to the zero-delayed case.

The simplest case is when $\{X_t\}$ has i.i.d. cycles. The $k$th cycle is defined
as $\{X_{T_k + t}\}_{0 \le t < Y_{k+1}}$; this expression is to be interpreted as a random element
of the space of all $E$-valued sequences with finite lifelengths. The property
of independent cycles is equivalent to the post-$T_k$ process $\{X_{T_k + t}\}_{t \ge 0}$ being
independent of $T_0, T_1, \ldots, T_k$ and $\{X_t\}_{0 \le t < T_k}$. For example, this covers discrete
Markov chains where we can take the $T_n$ as the instants with $X_t = i$ for some
arbitrary but fixed state $i$, or many queueing processes, where the $T_n$ are the
instants where a customer enters an empty system (then cycles = busy cycles).
However, the present more general definition is needed to deal with say Harris
recurrent Markov chains.

A regenerative process converges in distribution under very mild conditions:
Proposition A1.3 Consider a regenerative process such that the cycle length
distribution is non-lattice with $\mu < \infty$. Then $X_t \xrightarrow{\mathcal{D}} X_{\infty}$ where the distribution
of $X_{\infty}$ is given by

$$E f(X_{\infty}) = \frac{1}{\mu}\, E_0 \int_0^{Y_1} f(X_t)\, dt. \qquad (A.6)$$

If $F$ is spread-out, then $X_t \to X_{\infty}$ in total variation.


1d Cumulative processes 


Let $\{T_n\}$ be a renewal process with i.i.d. cycles (we allow a different distribution
of the first cycle). Then $\{Z_t\}_{t \ge 0}$ is called cumulative w.r.t. $\{T_n\}$ if the processes

$$\{Z_{T_n + t} - Z_{T_n}\}_{0 \le t < Y_{n+1}}$$

are i.i.d. for $n = 1, 2, \ldots$. An example is $Z_t = \int_0^{t} f(X_s)\, ds$ where $\{X_t\}$ is
regenerative w.r.t. $\{T_n\}$, with i.i.d. cycles. This is the case considered in [APQ, VI.3],
but in fact, just the same proof as there carries over to show:


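Proposition A1.4 Let $\{Z_t\}_{t \ge 0}$ be cumulative w.r.t. $\{T_n\}$, assume that $\mu < \infty$
and define $U_n = Z_{T_{n+1}} - Z_{T_n}$. Then:
(a) If

$$E \sup_{0 \le t < Y_1} |Z_{T_0 + t} - Z_{T_0}| < \infty,$$

then $Z_t/t \xrightarrow{\text{a.s.}} EU_1/\mu$;
(b) If in addition $\mathrm{Var}(U_1) < \infty$, then $(Z_t - tEU_1/\mu)/\sqrt{t}$ has a limiting normal
distribution with mean 0 and variance

$$\frac{1}{\mu}\Big[\mathrm{Var}(U_1) + \Big(\frac{EU_1}{\mu}\Big)^{2}\mathrm{Var}(Y_1) - \frac{2EU_1}{\mu}\,\mathrm{Cov}(U_1, Y_1)\Big].$$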


1e Residual and past lifetime


Consider a renewal process and define $\xi(t)$ as the residual lifetime of the
renewal interval straddling $t$, i.e. $\xi(t) = \inf\{T_k - t : t < T_k\}$, and $\eta(t) =
\inf\{t - T_k : T_k \le t\}$ as the age. Then $\{\xi(t)\}$, $\{\eta(t)\}$ are Markov with state
spaces $(0, \infty)$, resp. $[0, \infty)$. If $\mu = \infty$, then $\xi(t) \xrightarrow{\mathcal{D}} \infty$ (i.e. $P(\xi(t) \le a) \to 0$
for any $a < \infty$) and $\eta(t) \xrightarrow{\mathcal{D}} \infty$. Otherwise, under the condition of Blackwell's
renewal theorem, $\xi(t)$ and $\eta(t)$ both have a limiting stationary distribution $F_0$
given by the density $\bar F(x)/\mu$. We denote the limiting r.v.'s by $\xi, \eta$. Then it
holds more generally that $(\eta(t), \xi(t)) \xrightarrow{\mathcal{D}} (\eta, \xi)$, and we have:
Theorem A1.5 Under the condition of Blackwell's renewal theorem, the joint
distribution of $(\eta, \xi)$ is given by the following four equivalent statements:
(a) $P(\eta > x, \xi > y) = \dfrac{1}{\mu} \displaystyle\int_{x+y}^{\infty} \bar F(z)\, dz$;
(b) the joint distribution of $(\eta, \xi)$ is the same as the distribution of $(VW, (1 - V)W)$
where $V, W$ are independent, $V$ is uniform on $(0, 1)$ and $W$ has distribution
$F_W$ given by $dF_W/dF(x) = x/\mu$;
(c) the marginal distribution of $\eta$ is $F_0$, and the conditional distribution of $\xi$
given $\eta = y$ is the overshoot distribution $F^{(y)}$ given by $\bar F^{(y)}(z) = \bar F(y + z)/\bar F(y)$;
(d) the marginal distribution of $\xi$ is $F_0$, and the conditional distribution of $\eta$
given $\xi = z$ is $F^{(z)}$.

The proof of (a) is straightforward by viewing $\{(\eta(t), \xi(t))\}$ as a regenerative
process, and the equivalence of (a) with (b)-(d) is an easy exercise.


In V.4, we used:


Proposition A1.6 Consider a renewal process with $\mu < \infty$. Then $\xi(t)/t \xrightarrow{\text{a.s.}} 0$
and, if in addition $\mathbb{E}Y_0 < \infty$, $\mathbb{E}\xi(t)/t \to 0$.


Proof. The number $N_t$ of renewals before $t$ satisfies $N_t/t \xrightarrow{\text{a.s.}} 1/\mu$. Hence for $t$
large enough, we can bound $\xi(t)$ by $M(t) = \max\{Y_k : k \le 2t/\mu\}$. Since the
maximum $M_n$ of $n$ i.i.d. r.v.'s with finite mean satisfies $M_n/n \xrightarrow{\text{a.s.}} 0$ (Borel-
Cantelli), the first statement follows. For the second, assume first the renewal
process is zero-delayed. Then $\mathbb{E}_0\xi(t)$ satisfies a renewal equation with $z(t) =
\mathbb{E}[Y_1 - t;\ Y_1 > t]$. Hence

$$\mathbb{E}_0\xi(t) = \int_0^t U(dy)\,z(t-y) = \int_0^t U(t - dy)\,z(y) \le c \sum_{k=0}^{\lfloor t\rfloor} z(k)$$

where $c = \sup_x\, [U(x+1) - U(x)]$ ($c < \infty$ because it is easily seen that
$U(x+1) - U(x) \le U(1)$). Since $z(k) \le \mathbb{E}[Y_1;\ Y_1 > k] \to 0$, the sum is $o(t)$ so that
$\mathbb{E}_0\xi(t)/t \to 0$. In the general case, use

$$\mathbb{E}\xi(t) = \mathbb{E}[Y_0 - t;\ Y_0 > t] + \int_0^t \mathbb{E}_0\xi(t - y)\,\mathbb{P}(Y_0 \in dy).$$


1f Markov renewal theory 


By a Markov renewal process we understand a point process where the interarrival
times $Y_0, Y_1, Y_2, \ldots$ are not i.i.d. but governed by a Markov chain $\{J_n\}$ (we
assume here that the state space $E$ is finite) in the sense that

$$\mathbb{P}(Y_n \le y \mid \mathscr{J}) = F_{ij}(y) \quad \text{on } \{J_n = i,\ J_{n+1} = j\}$$

where $\mathscr{J} = \sigma(J_0, J_1, \ldots)$ and $(F_{ij})_{i,j\in E}$ is a family of distributions on $(0,\infty)$.
A stochastic process $\{X_t\}_{t\ge 0}$ is called semi-regenerative w.r.t. the Markov
renewal process if for any $n$, the conditional distribution of $\{X_{T_n+t}\}_{t\ge 0}$ given
$Y_0, Y_1, \ldots, Y_n, J_0, \ldots, J_{n-1}, J_n = i$ is the same as the $\mathbb{P}_i$-distribution of $\{X_t\}_{t\ge 0}$
itself where $\mathbb{P}_i$ refers to the case $J_0 = i$.

A Markov renewal process $\{T_n\}$ contains an imbedded renewal process,
namely $\{T_{\omega_k}\}$ where $\{\omega_k\}$ is the sequence of instants $\omega$ where $J_\omega = i_0$ for
some arbitrary but fixed reference state $i_0 \in E$. The semi-regenerative process
is then regenerative w.r.t. $\{T_{\omega_k}\}$. These facts allow many definitions and results
to be reduced to ordinary renewal and regenerative processes. For example, the
semi-regenerative process is called non-lattice if $\{T_{\omega_k}\}$ is non-lattice (it is easily
seen that this definition does not depend on $i_0$). Further:


Proposition A1.7 Consider a non-lattice semi-regenerative process. Assume
that $\mu_j = \mathbb{E}_j Y_0 < \infty$ for all $j$ and that $\{J_n\}$ is irreducible with stationary
distribution $(\nu_j)_{j\in E}$. Then $X_t \xrightarrow{\mathcal{D}} X_\infty$ where the distribution of $X_\infty$ is given by

$$\mathbb{E}\,g(X_\infty) = \frac{1}{\mu}\sum_{j\in E} \nu_j\, \mathbb{E}_j \int_0^{Y_0} g(X_t)\,dt$$

where $\mu = \sum_{j\in E} \nu_j \mu_j$.


Notes and references Renewal theory and regenerative processes are treated, 
e.g., in [APQ], Alsmeyer [45] and Thorisson [850]. 


A2 Wiener-Hopf factorization 


Let $F$ be a distribution which is not concentrated on $(-\infty,0]$ or $(0,\infty)$. Let
$X_1, X_2, \ldots$ be i.i.d. with common distribution $F$, $S_n = X_1 + \cdots + X_n$ the
associated random walk, and define

$$\tau_+ = \inf\{n>0 : S_n > 0\}, \qquad \tau_- = \inf\{n>0 : S_n \le 0\},$$

$$G_+(x) = \mathbb{P}(S_{\tau_+} \le x,\ \tau_+ < \infty), \qquad G_-(x) = \mathbb{P}(S_{\tau_-} \le x,\ \tau_- < \infty).$$

We call $\tau_+$ ($\tau_-$) the strict ascending (weak descending) ladder epoch and $G_+$
($G_-$) the corresponding ladder height distributions.

Probabilistic Wiener-Hopf theory deals with the relation between $F$, $G_+$,
$G_-$, the renewal measures

$$U_+ = \sum_{n=0}^\infty G_+^{*n}, \qquad U_- = \sum_{n=0}^\infty G_-^{*n},$$

and the $\tau_+$- and $\tau_-$-pre-occupation measures

$$R_+(A) = \mathbb{E}\sum_{n=0}^{\tau_+ - 1} I(S_n \in A), \qquad R_-(A) = \mathbb{E}\sum_{n=0}^{\tau_- - 1} I(S_n \in A).$$

The basic identities are the following:

Theorem A2.1 (a) $F = G_+ + G_- - G_+ * G_-$;

(b) $G_-(A) = \int_0^\infty F(A - x)\,R_-(dx)$, $A \subseteq (-\infty, 0]$;

(c) $G_+(A) = \int_{-\infty}^0 F(A - x)\,R_+(dx)$, $A \subseteq (0,\infty)$;

(d) $R_+ = U_-$; (e) $R_- = U_+$.

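Proof. Considering the restrictions of measures to $(-\infty, 0]$ and $(0,\infty)$, we may
rewrite (a) as

$$G_-(A) = F(A) + (G_+ * G_-)(A), \qquad A \subseteq (-\infty, 0], \tag{A.7}$$

$$G_+(A) = F(A) + (G_+ * G_-)(A), \qquad A \subseteq (0, \infty) \tag{A.8}$$

(e.g. (A.7) follows since $G_+(A) = 0$ when $A \subseteq (-\infty,0]$). In (A.7), $F(A)$ is the
contribution from the event $\{\tau_- = 1\} = \{X_1 \le 0\}$. On $\{\tau_- \ge 2\}$, define $\omega$ as the
time where the pre-$\tau_-$ path $S_1, \ldots, S_{\tau_- - 1}$ is at its minimum. More rigorously,
we consider the last such time (to make $\omega$ unique) so that

$$\{\omega = m,\ \tau_- = n\} = \{S_j - S_m \ge 0,\ 0 < j < m,\ \ S_j - S_m > 0,\ m < j < n\}.$$


FIGURE A.1

Reversing the time points $0, 1, \ldots, m$ it follows (see Fig. A.1) that

$$\mathbb{P}(S_j - S_m \ge 0,\ 0 < j < m,\ S_m \in du) = \mathbb{P}(\tau_+ = m,\ S_{\tau_+} \in du).$$

Also, clearly

$$\mathbb{P}(S_j - S_m > 0,\ m < j < n,\ S_n \in A \mid S_m \in du) = \mathbb{P}(\tau_- = n - m,\ S_{\tau_-} \in A - u)$$

(see again Fig. A.1). It follows that for $n \ge 2$

$$\mathbb{P}(\tau_- = n,\ S_{\tau_-} \in A) = \sum_{m=1}^{n-1}\int_0^\infty \mathbb{P}(\tau_- = n,\ \omega = m,\ S_m \in du,\ S_{\tau_-} \in A)$$

$$= \sum_{m=1}^{n-1}\int_0^\infty \mathbb{P}(\tau_+ = m,\ S_{\tau_+} \in du)\,\mathbb{P}(\tau_- = n - m,\ S_{\tau_-} \in A - u).$$

Summing over $n = 2, 3, \ldots$ and reversing the order of summation yields

$$\mathbb{P}(\tau_- \ge 2,\ S_{\tau_-} \in A) = \int_0^\infty \sum_{m=1}^\infty \mathbb{P}(\tau_+ = m,\ S_{\tau_+} \in du) \sum_{n=m+1}^\infty \mathbb{P}(\tau_- = n - m,\ S_{\tau_-} \in A - u)$$

$$= \int_0^\infty \mathbb{P}(S_{\tau_+} \in du)\,\mathbb{P}(S_{\tau_-} \in A - u) = (G_+ * G_-)(A).$$

Collecting terms, (A.7) follows, and the proof of (A.8) is similar.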
(c) follows from

$$G_+(A) = \sum_{n=1}^\infty \mathbb{P}(S_n \in A,\ \tau_+ = n)$$

$$= \sum_{n=1}^\infty \int_{-\infty}^0 \mathbb{P}(S_k \le 0,\ 0 < k < n,\ S_{n-1} \in dx,\ X_n \in A - x)$$

$$= \int_{-\infty}^0 \sum_{n=1}^\infty F(A - x)\,\mathbb{P}(S_k \le 0,\ 0 < k < n,\ S_{n-1} \in dx)$$

and the proof of (b) is similar. For (d), consider a fixed $n$ and let $X_k^* = X_{n-k+1}$,
$S_k^* = X_1^* + \cdots + X_k^* = S_n - S_{n-k}$. Then for $A \subseteq (-\infty, 0]$,

$$\mathbb{P}(S_n \in A,\ \tau_+ > n) = \mathbb{P}(S_k \le 0,\ 0 < k \le n,\ S_n \in A)$$

$$= \mathbb{P}(S_n^* \le S_{n-k}^*,\ 0 < k \le n,\ S_n^* \in A)$$

$$= \mathbb{P}(S_n^* \le S_k^*,\ 0 \le k < n,\ S_n^* \in A)$$

$$= \mathbb{P}(S_n \le S_k,\ 0 \le k < n,\ S_n \in A)$$

is the probability that $n$ is a weak descending ladder point with $S_n \in A$. Summing
over $n$ yields $R_+(A) = U_-(A)$, and the proof of (e) is similar.


Remark A2.2 In terms of m.g.f.'s, we can rewrite (a) as

$$1 - \widehat{F}[s] = (1 - \widehat{G}_+[s])(1 - \widehat{G}_-[s]) \tag{A.9}$$

whenever $\widehat{F}[s]$, $\widehat{G}_+[s]$, $\widehat{G}_-[s]$ are defined at the same time; this always holds
on the line $\Re(s) = 0$, and sometimes in a larger strip. Since $G_+$ is concentrated
on $(0,\infty)$, $H_+(s) = 1 - \widehat{G}_+[s]$ is defined and bounded in the half-plane
$\{s : \Re(s) \le 0\}$ and non-zero in $\{s : \Re(s) < 0\}$ (because $\|G_+\| \le 1$), and similarly
$H_-(s) = 1 - \widehat{G}_-[s]$ is defined and bounded in the half-plane $\{s : \Re(s) \ge 0\}$
and non-zero in $\{s : \Re(s) > 0\}$. The classical analytical form of the Wiener-Hopf
problem is to write $1 - \widehat{F}$ as a product $H_+ H_-$ of functions with such properties.


Notes and references In its above discrete time version, Wiener-Hopf theory is 
only used at a few places in this book. However, it serves as model and motivation 
for a number of results and arguments in continuous time. E.g., the derivation of the 
form of $G_+$ for the compound Poisson model (Theorem III.5.1), which is basic for
the Pollaczeck-Khinchine formula, is based upon representing $G_+$ as in (c), and using
time-reversion as in (d) to obtain the explicit form of $R_+$ (Lebesgue measure).

In continuous time, the analogue of a random walk is a process with stationary 
independent increments (a Lévy process, cf. III.4). In this generality, there is no 
direct analogue of Theorem A2.1. For example, if $\{S_t\}$ is Brownian motion, then
$\tau_+ = \inf\{t > 0 : S_t > 0\}$ is 0 a.s., and $G_+$, $G_-$ are trivial, being concentrated at
0. Nevertheless, a number of related identities can be derived. An early survey is
Bingham [168], and we further refer to Section XI.4d. 

Another main extension of the theory deals with Markov dependence. In discrete 
time, there are direct analogues of Theorem A2.1; see e.g. the survey [57] by the 
author and the extensive list of references there. Again, such developments motivate 
the approach in Chapter VII on the Markovian environment model. 

The present proof of Theorem A2.1(a) is from Kennedy [529].

A3  Matrix-exponentials


The exponential $e^A$ of a $p \times p$ matrix $A$ is defined by the usual series expansion

$$e^A = \sum_{n=0}^\infty \frac{A^n}{n!}.$$

The series is always convergent because $A^n = O(n^k|\lambda|^n)$ for some integer $k < p$,
where $\lambda$ is the eigenvalue of largest absolute value, $|\lambda| = \max\{|\mu| : \mu \in \mathrm{sp}(A)\}$
and $\mathrm{sp}(A)$ is the set of all eigenvalues of $A$ (the spectrum).

Some fundamental properties are the following:

$$\mathrm{sp}(e^A) = \{e^\lambda : \lambda \in \mathrm{sp}(A)\} \tag{A.10}$$

$$\frac{d}{dt}e^{At} = Ae^{At} = e^{At}A \tag{A.11}$$

$$A\int_0^x e^{At}\,dt = e^{Ax} - I \tag{A.12}$$

$$e^{\Delta^{-1}A\Delta} = \Delta^{-1}e^{A}\Delta \tag{A.13}$$

whenever $\Delta$ is a diagonal matrix with all diagonal elements non-zero.

It is seen from Theorem IX.1.5 that when handling phase-type distributions,
one needs to compute matrix-inverses $Q^{-1}$ and matrix-exponentials $e^{Qt}$ (or just
$e^Q$). Here it is standard to compute matrix-inverses by Gauss-Jordan elimination
with full pivoting, whereas there is no similar single established approach
in the case of matrix-exponentials. Here are, however, three of the currently
most widely used ones:


Example A3.1 (SCALING AND SQUARING) The difficulty in directly applying
the series expansion $e^Q = \sum_0^\infty Q^n/n!$ arises when the elements of $Q$ are large.
Then the elements of $Q^n/n!$ do not decrease very rapidly to zero and may
contribute a non-negligible amount to $e^Q$ even when $n$ is quite large, so that very
many terms of the series may be needed (one may even experience floating point
overflow when computing $Q^n$). To circumvent this, write $e^Q = (e^K)^m$ where
$K = Q/m$ for some suitable integer $m$ (this is the scaling step). Thus, if $m$ is
sufficiently large, $\sum_0^\infty K^n/n!$ converges rapidly and can be evaluated without
problems, and $e^Q$ can then be computed as the $m$th power (by repeated squaring if
$m = 2^k$).
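The scaling and squaring steps can be sketched in a few lines of plain Python. This is a minimal illustration, not a production routine: the function names, the choice $m = 2^j$ with $j$ picked so that the maximal absolute row sum of $Q/m$ is at most 1/2, and the fixed number of series terms are all assumptions made for this sketch.

```python
from math import ceil, log2


def mat_mul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def identity(p):
    return [[1.0 if i == j else 0.0 for j in range(p)] for i in range(p)]


def expm_series(k, terms=25):
    """Truncated series sum_{n < terms} K^n / n!; adequate when K is small."""
    result = identity(len(k))
    term = identity(len(k))
    for n in range(1, terms):
        # term now holds K^n / n!
        term = [[x / n for x in row] for row in mat_mul(term, k)]
        result = [[r + t for r, t in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]
    return result


def expm_scaling_squaring(q, terms=25):
    """e^Q = (e^{Q/m})^m with m = 2^j (assumed choice: row-sum norm of Q/m <= 1/2)."""
    norm = max(sum(abs(x) for x in row) for row in q)
    j = max(0, ceil(log2(norm)) + 1) if norm > 0 else 0
    k = [[x / 2 ** j for x in row] for row in q]   # scaling step
    e = expm_series(k, terms)
    for _ in range(j):                             # squaring step, j times
        e = mat_mul(e, e)
    return e
```

Choosing $m$ as a power of 2 means the $m$th power costs only $j = \log_2 m$ matrix multiplications.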
Example A3.2 (UNIFORMIZATION) Formally, the procedure consists in choosing
some suitable $\eta > 0$, letting $P = I + Q/\eta$ and truncating the series in the
identity

$$e^{Qt} = e^{-\eta t}\sum_{n=0}^\infty \frac{P^n(\eta t)^n}{n!} \tag{A.14}$$

which is easily seen to be valid as a consequence of $e^{Qt} = e^{\eta(P - I)t} = e^{-\eta t}e^{\eta Pt}$.

The idea which lies behind is uniformization of a Markov process $\{X_t\}$, i.e.
construction of $\{X_t\}$ by realizing the jump times as a thinning of a Poisson
process $\{N_t\}$ with constant intensity $\eta$. To this end, assume that $Q$ is the
intensity matrix for $\{X_t\}$ and choose $\eta$ with

$$\eta \ge \max_{i,j}|q_{ij}| = \max_i\,(-q_{ii}). \tag{A.15}$$

Then it is easily checked that $P$ is a transition matrix, and we may consider
a new Markov process $\{\widetilde{X}_t\}$ which has jumps governed by $P$ and occurring at
epochs of $\{N_t\}$ only (note that since $p_{ii}$ is typically non-zero, some jumps are
dummy in the sense that no state transition occurs). However, the intensity
matrix $\widetilde{Q}$ is the same as the one $Q$ for $\{X_t\}$ since a jump from $i$ to $j \ne i$ occurs
at rate $\widetilde{q}_{ij} = \eta p_{ij} = q_{ij}$. The probabilistic reason that (A.14) holds is therefore
that the $t$-step transition matrix for $\{\widetilde{X}_t\}$ is

$$e^{Qt} = e^{\widetilde{Q}t} = \sum_{n=0}^\infty e^{-\eta t}\frac{(\eta t)^n}{n!}P^n$$

(to see this, condition upon the number $n$ of Poisson events in $[0, t]$).
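A minimal Python sketch of the truncation of (A.14), assuming that $Q$ is an intensity matrix given as a list of rows. The function name, the stopping rule (stop once the neglected Poisson mass falls below `tol`) and the choice of the smallest $\eta$ allowed by (A.15) are illustrative assumptions. For very large $\eta t$ the starting weight $e^{-\eta t}$ underflows, which this sketch does not address.

```python
from math import exp


def mat_mul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def uniformized_expm(q, t, tol=1e-12, max_terms=100_000):
    """Approximate e^{Qt} for an intensity matrix Q by truncating (A.14)."""
    p = len(q)
    eye = [[1.0 if i == j else 0.0 for j in range(p)] for i in range(p)]
    eta = max(-q[i][i] for i in range(p))        # smallest eta allowed by (A.15)
    if eta == 0.0:                               # Q = 0, hence e^{Qt} = I
        return eye
    # P = I + Q / eta is a transition matrix
    big_p = [[eye[i][j] + q[i][j] / eta for j in range(p)] for i in range(p)]
    power = eye                                  # P^n, starting with n = 0
    weight = exp(-eta * t)                       # Poisson(eta * t) probability of n
    result = [[weight * x for x in row] for row in power]
    mass = weight                                # Poisson mass accumulated so far
    n = 0
    while 1.0 - mass > tol and n < max_terms:
        n += 1
        power = mat_mul(power, big_p)
        weight *= eta * t / n
        result = [[r + weight * x for r, x in zip(r_row, p_row)]
                  for r_row, p_row in zip(result, power)]
        mass += weight
    return result
```

All terms of the series are non-negative, so no cancellation occurs, and the neglected Poisson mass bounds the truncation error in every entry.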


Example A3.3 (DIFFERENTIAL EQUATIONS) Letting $K_t = e^{Qt}$, we have $\dot{K} =
QK$ (or $KQ$) which is a system of $p^2$ linear differential equations which can
be solved numerically by standard algorithms (say the Runge-Kutta method)
subject to the boundary condition $K_0 = I$.

In practice, what is needed is quite often only $Z_t = \pi e^{Qt}$ (or $e^{Qt}h$) with $\pi$
($h$) a given row (column) vector. One can then reduce to $p$ linear differential
equations by noting that $\dot{Z} = ZQ$, $Z_0 = \pi$ ($\dot{Z} = QZ$, $Z_0 = h$).

The approach is in particular convenient if one wants $e^{Qt}$ for many different
values of $t$.
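As a hedged illustration of the reduction to $p$ equations, the following sketch integrates $\dot{Z} = ZQ$, $Z_0 = \pi$ with the classical fourth-order Runge-Kutta scheme and a fixed step size. The function names and the fixed number of steps are assumptions of this sketch; a practical solver would typically use adaptive step-size control.

```python
def vec_mat(z, q):
    """Row vector times matrix."""
    return [sum(z[i] * q[i][j] for i in range(len(z))) for j in range(len(q[0]))]


def rk4_row_vector(pi, q, t, steps=1000):
    """Approximate Z_t = pi e^{Qt} by classical RK4 applied to Z' = ZQ, Z_0 = pi."""
    h = t / steps
    z = list(pi)
    for _ in range(steps):
        k1 = vec_mat(z, q)
        k2 = vec_mat([zi + 0.5 * h * ki for zi, ki in zip(z, k1)], q)
        k3 = vec_mat([zi + 0.5 * h * ki for zi, ki in zip(z, k2)], q)
        k4 = vec_mat([zi + h * ki for zi, ki in zip(z, k3)], q)
        z = [zi + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
             for zi, a, b, c, d in zip(z, k1, k2, k3, k4)]
    return z
```

Recording `z` after each step gives $\pi e^{Qs}$ on the whole grid $s = h, 2h, \ldots, t$ at no extra cost.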


Here is a further method which appears quite appealing at first sight:


Example A3.4 (DIAGONALIZATION) Assume that $Q$ has diagonal form, i.e.
$p$ different eigenvalues $\lambda_1, \ldots, \lambda_p$. Let $\nu_1, \ldots, \nu_p$ be the corresponding left
(row) eigenvectors and $h_1, \ldots, h_p$ the corresponding right (column) eigenvectors,
$\nu_i Q = \lambda_i\nu_i$, $Qh_i = \lambda_i h_i$. Then $\nu_i h_j = 0$, $i \ne j$, and $\nu_i h_i \ne 0$, and we
may adopt some normalization convention ensuring $\nu_i h_i = 1$. Then

$$Q = \sum_{i=1}^p \lambda_i h_i\nu_i = \sum_{i=1}^p \lambda_i h_i \otimes \nu_i, \tag{A.16}$$

$$e^{Qt} = \sum_{i=1}^p e^{\lambda_i t} h_i\nu_i = \sum_{i=1}^p e^{\lambda_i t} h_i \otimes \nu_i. \tag{A.17}$$

Thus, we have an explicit formula for $e^{Qt}$ once the $\lambda_i$, $\nu_i$, $h_i$ have been computed;
this last step is equivalent to finding a matrix $H$ such that $H^{-1}QH$ is a diagonal
matrix, say $\Delta = (\lambda_i)_{\mathrm{diag}}$, and writing $e^{Qt}$ as

$$e^{Qt} = He^{\Delta t}H^{-1} = H\,(e^{\lambda_i t})_{\mathrm{diag}}\,H^{-1}. \tag{A.18}$$

Namely, we can take $H$ as the matrix with columns $h_1, \ldots, h_p$.


There are, however, two serious drawbacks of this approach: 


Numerical instability : If the $\lambda_i$ are too close, (A.18) contains terms which
almost cancel and the loss of digits may be disastrous. The phenomenon
occurs not least when the dimension $p$ is large. In view of this phenomenon
alone, care should be taken when using diagonalization as a general tool for
computing matrix-exponentials.


Complex calculus : Typically, not all $\lambda_i$ are real, and we need to have access
to software permitting calculations with complex numbers or to perform
the cumbersome translation into real and imaginary parts.


Nevertheless, some cases remain where diagonalization may still be appeal- 
ing. 


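Example A3.5 If

$$Q = \begin{pmatrix} q_{11} & q_{12} \\ q_{21} & q_{22} \end{pmatrix}$$

is $2 \times 2$, the eigenvalue, say $\lambda_1$, of largest real part is often real (say, under the
conditions of the Perron-Frobenius theorem), and hence $\lambda_2$ is so because of
$\lambda_1 + \lambda_2 = \mathrm{tr}(Q)$. Everything is nice and explicit here:

$$\lambda_1 = \frac{q_{11} + q_{22} + \sqrt{D}}{2}, \qquad \lambda_2 = \frac{q_{11} + q_{22} - \sqrt{D}}{2}, \qquad \text{where } D = (q_{11} - q_{22})^2 + 4q_{12}q_{21}.$$

Write $\pi$ ($= \nu_1$) for the left eigenvector corresponding to $\lambda_1$ and $k$ ($= h_1$) for
the right eigenvector. Then

$$\pi = (\pi_1\ \ \pi_2) = a\,(q_{21}\ \ \lambda_1 - q_{11}), \qquad k = \begin{pmatrix} k_1 \\ k_2 \end{pmatrix} = b\begin{pmatrix} q_{12} \\ \lambda_1 - q_{11} \end{pmatrix},$$

where $a, b$ are any constants ensuring $\pi k = 1$, i.e.

$$ab\,\big(q_{12}q_{21} + (\lambda_1 - q_{11})^2\big) = 1.$$

Of course, $\nu_2$ and $h_2$ can be computed in just the same way, replacing $\lambda_1$ by
$\lambda_2$. However, it is easier to note that $\pi h_2 = 0$ and $\nu_2 k = 0$ implies

$$\nu_2 = (k_2\ \ {-k_1}), \qquad h_2 = \begin{pmatrix} \pi_2 \\ -\pi_1 \end{pmatrix}.$$

Thus,

$$e^{Qt} = e^{\lambda_1 t}\begin{pmatrix} \pi_1 k_1 & \pi_2 k_1 \\ \pi_1 k_2 & \pi_2 k_2 \end{pmatrix} + e^{\lambda_2 t}\begin{pmatrix} \pi_2 k_2 & -\pi_2 k_1 \\ -\pi_1 k_2 & \pi_1 k_1 \end{pmatrix}. \tag{A.19}$$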
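The $2 \times 2$ recipe is short enough to code directly. The sketch below follows (A.19) under the extra assumption $q_{12}q_{21} > 0$ (which guarantees $D > 0$ and a non-zero normalizing constant) and uses the particular normalization $b = 1$, $a = 1/(q_{12}q_{21} + (\lambda_1 - q_{11})^2)$; the function name and these choices are illustrative only.

```python
from math import exp, sqrt


def expm_2x2(q, t):
    """e^{Qt} for a 2 x 2 matrix via (A.19); this sketch assumes q12 * q21 > 0."""
    (q11, q12), (q21, q22) = q
    if q12 * q21 <= 0:
        raise ValueError("this sketch assumes q12 * q21 > 0")
    d = (q11 - q22) ** 2 + 4 * q12 * q21
    lam1 = (q11 + q22 + sqrt(d)) / 2
    lam2 = (q11 + q22 - sqrt(d)) / 2
    scale = q12 * q21 + (lam1 - q11) ** 2        # normalization: a = 1/scale, b = 1
    pi1, pi2 = q21 / scale, (lam1 - q11) / scale  # left eigenvector for lam1
    k1, k2 = q12, lam1 - q11                      # right eigenvector for lam1
    e1, e2 = exp(lam1 * t), exp(lam2 * t)
    return [[e1 * pi1 * k1 + e2 * pi2 * k2, e1 * pi2 * k1 - e2 * pi2 * k1],
            [e1 * pi1 * k2 - e2 * pi1 * k2, e1 * pi2 * k2 + e2 * pi1 * k1]]
```

At $t = 0$ the two matrices in (A.19) add up to $(\pi k)I = I$, which gives a quick sanity check of the normalization.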


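Example A3.6 A particularly important case arises when

$$Q = \begin{pmatrix} -q_1 & q_1 \\ q_2 & -q_2 \end{pmatrix}$$

is an intensity matrix. Then $\lambda_1 = 0$ and the corresponding left and right
eigenvectors are the stationary probability distribution $\pi$ and $e$. The other eigenvalue
is $\lambda = \lambda_2 = -q_1 - q_2$, and after some trivial calculus one gets

$$e^{Qt} = \begin{pmatrix} \pi_1 & \pi_2 \\ \pi_1 & \pi_2 \end{pmatrix} + e^{\lambda t}\begin{pmatrix} \pi_2 & -\pi_2 \\ -\pi_1 & \pi_1 \end{pmatrix} \quad\text{where} \tag{A.20}$$

$$\pi = (\pi_1\ \ \pi_2) = \Big(\frac{q_2}{q_1 + q_2}\ \ \ \frac{q_1}{q_1 + q_2}\Big). \tag{A.21}$$

Here the first term is the stationary limit and the second term thus describes
the rate of convergence to stationarity.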


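Example A3.7 Let

$$Q = \begin{pmatrix} -\frac{3}{2} & \frac{9}{14} \\[2pt] \frac{7}{2} & -\frac{11}{2} \end{pmatrix}.$$

Then

$$D = \Big(-\frac{3}{2} + \frac{11}{2}\Big)^2 + 4\cdot\frac{9}{14}\cdot\frac{7}{2} = 25,$$

$$\lambda_1 = \frac{-3/2 - 11/2 + 5}{2} = -1, \qquad \lambda_2 = \frac{-3/2 - 11/2 - 5}{2} = -6,$$

$$\pi = a\Big(\frac{7}{2}\ \ \ {-1} + \frac{3}{2}\Big) = a\Big(\frac{7}{2}\ \ \ \frac{1}{2}\Big), \qquad k = b\begin{pmatrix} \frac{9}{14} \\[2pt] -1 + \frac{3}{2} \end{pmatrix} = b\begin{pmatrix} \frac{9}{14} \\[2pt] \frac{1}{2} \end{pmatrix},$$

$$1 = ab\,\Big(\frac{9}{14}\cdot\frac{7}{2} + \Big(-1 + \frac{3}{2}\Big)^2\Big) = \frac{5}{2}\,ab,$$

$$\begin{pmatrix} \pi_1 k_1 & \pi_2 k_1 \\ \pi_1 k_2 & \pi_2 k_2 \end{pmatrix} = \begin{pmatrix} \frac{9}{10} & \frac{9}{70} \\[2pt] \frac{7}{10} & \frac{1}{10} \end{pmatrix},$$

$$e^{Qt} = e^{-t}\begin{pmatrix} \frac{9}{10} & \frac{9}{70} \\[2pt] \frac{7}{10} & \frac{1}{10} \end{pmatrix} + e^{-6t}\begin{pmatrix} \frac{1}{10} & -\frac{9}{70} \\[2pt] -\frac{7}{10} & \frac{9}{10} \end{pmatrix}.$$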

A4 Some linear algebra 


4a Generalized inverses 


A generalized inverse of a matrix $A$ is defined as any matrix $A^-$ satisfying

$$AA^-A = A. \tag{A.22}$$

Note that in this generality it is not assumed that $A$ is necessarily square, but
only that dimensions match, and a generalized inverse may not be unique.

Generalized inverses play an important role in statistics. They are most
often constructed by imposing some additional properties, for example

$$AA^+A = A, \quad A^+AA^+ = A^+, \quad (AA^+)' = AA^+, \quad (A^+A)' = A^+A. \tag{A.23}$$

A matrix $A^+$ satisfying (A.23) is called the Moore-Penrose inverse of $A$, and
exists and is unique (see for example Rao [727]). E.g., if $A$ is a possibly singular
covariance matrix (non-negative definite), then there exists an orthogonal matrix
$C$ such that $A = CDC'$ where

$$D = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_p \end{pmatrix}.$$

Here we can assume that the $\lambda_i$ are ordered such that $\lambda_1 > 0, \ldots, \lambda_m > 0$,
$\lambda_{m+1} = \cdots = \lambda_p = 0$ where $m \le p$ is the rank of $A$, and can define

$$A^+ = C\,\mathrm{diag}\big(\lambda_1^{-1}, \ldots, \lambda_m^{-1}, 0, \ldots, 0\big)\,C'.$$

In applied probability, one is also faced with singular matrices, most often
either an intensity matrix $Q$ or a matrix of the form $I - P$ where $P$ is a transition
matrix. Assume that a unique stationary distribution $\pi$ exists. Rather than
with generalized inverses, one then works with

$$Q^- = (Q - e\pi)^{-1}, \qquad (I - P)^- = (I - P + e\pi)^{-1}$$

(here $(I - P + e\pi)^{-1}$ goes under the name fundamental matrix of the Markov
chain). These matrices are not generalized inverses but act roughly as inverses
except that $\pi$ and $e$ play a particular role, e.g.

$$(Q - e\pi)^{-1}Q = Q(Q - e\pi)^{-1} = I - e\pi.$$
Here is a typical result on the role of such matrices in applied probability:


Proposition A4.1 Let $\Lambda$ be an irreducible intensity matrix with stationary row
vector $\pi$, and define $D = (\Lambda - e \otimes \pi)^{-1}$. Then for some $b > 0$,

$$\int_0^t e^{\Lambda x}\,dx = te\pi + D(e^{\Lambda t} - I) \tag{A.24}$$

$$= te\pi - e\pi - D + O(e^{-bt}), \tag{A.25}$$

$$\int_0^t xe^{\Lambda x}\,dx = \frac{t^2}{2}e\pi + t(D + e\pi) + tD(e^{\Lambda t} - I) - D^2(e^{\Lambda t} - I) \tag{A.26}$$

$$= \frac{t^2}{2}e\pi - e\pi + D^2 + O(e^{-bt}). \tag{A.27}$$


Proof. Let $A(t)$, $B(t)$ denote the l.h.s. of (A.24), resp. the r.h.s. Then $A(0) =
B(0) = 0$,

$$B'(t) = e\pi + D\Lambda e^{\Lambda t} = e\pi + (I - e\pi)e^{\Lambda t} = e^{\Lambda t} = A'(t).$$

(A.26) follows by integration by parts:

$$\int_0^t xe^{\Lambda x}\,dx = \Big[x\big\{xe\pi + D(e^{\Lambda x} - I)\big\}\Big]_0^t - \int_0^t \big\{xe\pi + D(e^{\Lambda x} - I)\big\}\,dx.$$

Finally, the formulas involving $O(e^{-bt})$ follow by Perron-Frobenius theory, see
below, together with $De = -e$.


4b The Kronecker product $\otimes$ and the Kronecker sum $\oplus$


We recall that if $A^{(1)}$ is a $k_1 \times m_1$ and $A^{(2)}$ a $k_2 \times m_2$ matrix, then the Kronecker
(tensor) product $A^{(1)} \otimes A^{(2)}$ is the $(k_1 \times k_2) \times (m_1 \times m_2)$ matrix with $(i_1i_2)(j_1j_2)$th
entry $a^{(1)}_{i_1j_1}a^{(2)}_{i_2j_2}$. Equivalently, in block notation ($k_1 = m_1 = 2$)

$$A \otimes B = \begin{pmatrix} a_{11}B & a_{12}B \\ a_{21}B & a_{22}B \end{pmatrix}.$$


Example A4.2 Let $\pi$ be a row vector with $m$ components and $h$ a column
vector with $k$ components. Interpreting $\pi, h$ as $1 \times m$ and $k \times 1$ matrices,
respectively, it follows that $h \otimes \pi$ is the $k \times m$ matrix with $ij$th element $h_i\pi_j$.
I.e. $h \otimes \pi$ reduces to $h\pi$ in standard matrix notation. Note that $h \otimes \pi$ has rank
1; the rows are proportional to $\pi$, and the columns to $h$, and in fact any rank
1 matrix can be written in this form. For example, for $k = m = 2$,

$$\begin{pmatrix} h_1 \\ h_2 \end{pmatrix} \otimes (\pi_1\ \ \pi_2) = \begin{pmatrix} h_1\pi_1 & h_1\pi_2 \\ h_2\pi_1 & h_2\pi_2 \end{pmatrix}.$$

Example A4.3 Let


Then 


aSpa 2/8 2/9 3V8 3v9 


A fundamental formula is

$$(A_1B_1C_1) \otimes (A_2B_2C_2) = (A_1 \otimes A_2)(B_1 \otimes B_2)(C_1 \otimes C_2). \tag{A.28}$$

In particular, if $A_1 = \nu_1$, $A_2 = \nu_2$ are row vectors and $C_1 = h_1$, $C_2 = h_2$ are
column vectors, then $\nu_1B_1h_1$ and $\nu_2B_2h_2$ are real numbers, and

$$\nu_1B_1h_1 \cdot \nu_2B_2h_2 = \nu_1B_1h_1 \otimes \nu_2B_2h_2 = (\nu_1 \otimes \nu_2)(B_1 \otimes B_2)(h_1 \otimes h_2). \tag{A.29}$$

If $A^{(1)}$ and $A^{(2)}$ are both square ($k_1 = m_1$ and $k_2 = m_2$), then the Kronecker
sum is defined by

$$A^{(1)} \oplus A^{(2)} = A^{(1)} \otimes I_{k_2} + I_{k_1} \otimes A^{(2)}. \tag{A.30}$$

A crucial property is the fact that the functional equation for the exponential
function generalizes to Kronecker notation (note that in contrast $e^{A+B} = e^Ae^B$
typically only holds when $A$ and $B$ commute):


Proposition A4.4 $e^{A \oplus B} = e^A \otimes e^B$.
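Proof. We shall use the binomial formula

$$(A \oplus B)^\ell = \sum_{k=0}^\ell \binom{\ell}{k} A^k \otimes B^{\ell-k}. \tag{A.31}$$

Indeed,

$$(A \oplus B)^\ell = (A \otimes I + I \otimes B)^\ell$$

is the sum of all products of $\ell$ factors, each of which is $A \otimes I$ or $I \otimes B$; if $A \otimes I$
occurs $k$ times, such a product is $A^k \otimes B^{\ell-k}$ according to (A.28), and the number
of such products is precisely given by the relevant binomial coefficient.

Using (A.31), it follows that

$$e^A \otimes e^B = \sum_{k=0}^\infty \frac{A^k}{k!} \otimes \sum_{j=0}^\infty \frac{B^j}{j!} = \sum_{\ell=0}^\infty \frac{1}{\ell!}\sum_{k=0}^\ell \binom{\ell}{k} A^k \otimes B^{\ell-k} = \sum_{\ell=0}^\infty \frac{(A \oplus B)^\ell}{\ell!} = e^{A \oplus B}.$$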
Remark A4.5 Many of the concepts and results in Kronecker calculus have
intuitive illustrations in probabilistic terms. Thus, $P = P^{(1)} \otimes P^{(2)}$ is the
transition matrix of the bivariate Markov chain $\{(X_n^{(1)}, X_n^{(2)})\}$, where $\{X_n^{(1)}\}$,
$\{X_n^{(2)}\}$ are independent Markov chains with transition matrices $P^{(1)}$, $P^{(2)}$, and

$$Q = Q^{(1)} \oplus Q^{(2)} = Q^{(1)} \otimes I + I \otimes Q^{(2)} \tag{A.32}$$

is the intensity matrix of the bivariate continuous Markov process $\{(Y_t^{(1)}, Y_t^{(2)})\}$,
where $\{Y_t^{(1)}\}$, $\{Y_t^{(2)}\}$ are independent Markov processes with intensity matrices
$Q^{(1)}$, $Q^{(2)}$; in the definition (A.32), the first term on the r.h.s. represents
transitions in the $\{Y_t^{(1)}\}$ component and the second transitions in the $\{Y_t^{(2)}\}$
component, and the form of the bivariate intensity matrix reflects the fact that
due to independence, $\{(Y_t^{(1)}, Y_t^{(2)})\}$ cannot change state in both components
at the same time.

A special case of Proposition A4.4 can easily be obtained by probabilistic
reasoning along the same lines. Let $P_s^{(1)}$, $P_s^{(2)}$, $P_s$ be the $s$-step transition
matrices of $\{Y_t^{(1)}\}$, $\{Y_t^{(2)}\}$, resp. $\{(Y_t^{(1)}, Y_t^{(2)})\}$. From what has been said about
independent Markov chains, we have $P_s = P_s^{(1)} \otimes P_s^{(2)}$. On the other hand,

$$P_s = \exp\{sQ\} = \exp\{s(Q^{(1)} \oplus Q^{(2)})\},$$

$$P_s^{(1)} = \exp\{sQ^{(1)}\}, \qquad P_s^{(2)} = \exp\{sQ^{(2)}\}.$$

Taking $s = 1$ for simplicity, $P_s = P_s^{(1)} \otimes P_s^{(2)}$ can therefore be rewritten as

$$\exp\{Q^{(1)} \oplus Q^{(2)}\} = \exp\{Q^{(1)}\} \otimes \exp\{Q^{(2)}\}.$$
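The identity $e^{A \oplus B} = e^A \otimes e^B$ is easy to check numerically for small matrices. The following sketch builds the Kronecker product and sum from lists of rows and evaluates the exponentials by a plain truncated power series; the helper names and the fixed number of series terms are assumptions of this illustration, and the series approach is only adequate for matrices with small entries.

```python
def mat_mul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def identity(p):
    return [[1.0 if i == j else 0.0 for j in range(p)] for i in range(p)]


def kron(a, b):
    """Kronecker product: entry ((i1,i2),(j1,j2)) equals a[i1][j1] * b[i2][j2]."""
    return [[a[i1][j1] * b[i2][j2]
             for j1 in range(len(a[0])) for j2 in range(len(b[0]))]
            for i1 in range(len(a)) for i2 in range(len(b))]


def kron_sum(a, b):
    """Kronecker sum of two square matrices: A (x) I + I (x) B."""
    left = kron(a, identity(len(b)))
    right = kron(identity(len(a)), b)
    return [[x + y for x, y in zip(l_row, r_row)] for l_row, r_row in zip(left, right)]


def expm_series(a, terms=60):
    """Truncated series sum_{n < terms} A^n / n!; fine for small matrices only."""
    result = identity(len(a))
    term = identity(len(a))
    for n in range(1, terms):
        term = [[x / n for x in row] for row in mat_mul(term, a)]
        result = [[r + t for r, t in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]
    return result


def max_abs_diff(a, b):
    return max(abs(x - y) for a_row, b_row in zip(a, b) for x, y in zip(a_row, b_row))
```

For two intensity matrices `q1`, `q2`, the quantity `max_abs_diff(expm_series(kron_sum(q1, q2)), kron(expm_series(q1), expm_series(q2)))` should then be of the order of rounding error.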


Also the following formula is basic:

Lemma A4.6 Suppose that $A$ and $B$ are both square such that $\alpha + \beta < 0$
whenever $\alpha$ is an eigenvalue of $A$ and $\beta$ is an eigenvalue of $B$. Let further $\pi, \nu$
be any row vectors and $h, k$ any column vectors. Then

$$\int_0^x \pi e^{At}h \cdot \nu e^{Bt}k\,dt = (\pi \otimes \nu)(A \oplus B)^{-1}\big(e^{(A \oplus B)x} - I\big)(h \otimes k). \tag{A.33}$$

Proof. According to (A.29), the integrand can be written as

$$(\pi \otimes \nu)(e^{At} \otimes e^{Bt})(h \otimes k) = (\pi \otimes \nu)\,e^{(A \oplus B)t}\,(h \otimes k).$$

Now note that the eigenvalues of $A \oplus B$ are of the form $\alpha + \beta$ whenever $\alpha$ is an
eigenvalue of $A$ and $\beta$ is an eigenvalue of $B$, so that by assumption $A \oplus B$ is
invertible, and appeal to (A.12).


4c The Perron-Frobenius theorem 


Let $A$ be a $p \times p$-matrix with non-negative elements. We call $A$ irreducible if the
pattern of zero and non-zero elements is the same as for an irreducible transition
matrix. That is, for each $i, j = 1, \ldots, p$ there should exist $i_0, i_1, \ldots, i_n$ such that
$i_0 = i$, $i_n = j$ and $a_{i_{k-1}i_k} > 0$ for $k = 1, \ldots, n$. Similarly, $A$ is called aperiodic
if the pattern of zero and non-zero elements is the same as for an aperiodic
transition matrix.

Here is the Perron-Frobenius theorem, which can be found in a great number
of books, see e.g. [APQ, I.6] and references there:


Theorem A4.7 Let $A$ be an irreducible $p \times p$-matrix with non-negative elements. Then:
(a) The spectral radius $\lambda_0 = \max\{|\lambda| : \lambda \in \mathrm{sp}(A)\}$ is itself a strictly positive
and simple eigenvalue of $A$, and the corresponding left and right eigenvectors
$\nu, h$ can be chosen with strictly positive elements;

(b) if in addition $A$ is aperiodic, then $|\lambda| < \lambda_0$ for all $\lambda \in \mathrm{sp}(A)$ with $\lambda \ne \lambda_0$, and if we
normalize $\nu, h$ such that $\nu h = 1$, then

$$A^n = \lambda_0^n h\nu + O(\mu^n) = \lambda_0^n h \otimes \nu + O(\mu^n) \tag{A.34}$$

for some $\mu \in (0, \lambda_0)$.


Note that for a transition matrix, we have $\lambda_0 = 1$, $h = e$ and $\nu = \pi$ (the
stationary row vector).

The Perron-Frobenius theorem has an analogue for matrices $B$ with properties
similar to intensity matrices:

Corollary A4.8 Let $B$ be an irreducible$^2$ $p \times p$-matrix with non-negative
off-diagonal elements. Then the eigenvalue $\lambda_0$ with largest real part is simple and
real, and the corresponding left and right eigenvectors $\nu, h$ can be chosen with
strictly positive elements. Furthermore, if we normalize $\nu, h$ such that $\nu h = 1$,
then

$$e^{Bt} = e^{\lambda_0 t}h\nu + O(e^{\mu t}) = e^{\lambda_0 t}h \otimes \nu + O(e^{\mu t}) \tag{A.35}$$

for some $\mu \in (-\infty, \lambda_0)$.


Note that for an intensity matrix, we have $\lambda_0 = 0$, $h = e$ and $\nu = \pi$ (the
stationary row vector).

Corollary A4.8 is most often not stated explicitly in textbooks (but see
[APQ, II.4d] for intensity matrices!), but is an easy consequence of the
Perron-Frobenius theorem. For example, one can consider $A = \eta I + B$ where $\eta > 0$ is so
large that all diagonal elements of $A$ are strictly positive (then $A$ is irreducible
and aperiodic), relate the eigenvalues of $B$ to those of $A$ via (A.10) and use the
formula

$$e^{Bt} = e^{-\eta t}e^{At} = e^{-\eta t}\sum_{n=0}^\infty \frac{A^nt^n}{n!}$$

(cf. the analogy of this procedure with uniformization, Example A3.2).


A5 Complements on phase-type distributions 


5a Asymptotic exponentiality 


In Proposition IX.1.8, it was shown that under mild conditions the tail of a
phase-type distribution $B$ is asymptotically exponential. The next result gives
a condition for asymptotic exponentiality, not only in the tail but in the whole
distribution. The content is that $B$ is approximately exponential if the exit rates
$t_i$ are small compared to the feedback intensities $t_{ij}$ ($i \ne j$). To this end, note
that we can write the phase generator $T$ as $Q - (t_i)_{\mathrm{diag}}$ where $Q = T + (t_i)_{\mathrm{diag}}$
is a proper intensity matrix ($Qe = 0$). I.e. the condition is that $t$ is small
compared to $Q$.


Proposition A5.1 Let $Q$ be a proper irreducible intensity matrix with stationary
distribution $\alpha$, let $t = (t_i)_{i\in E} \ne 0$ have non-negative entries and define
$T^{(a)} = aQ - (t_i)_{\mathrm{diag}}$. Then for any $\beta$, the phase-type distribution $B^{(a)}$ with
representation $(\beta, T^{(a)})$ is asymptotically exponential with parameter $t^* = \sum_{i\in E}\alpha_it_i$
as $a \to \infty$, i.e. $\overline{B}^{(a)}(x) \to e^{-t^*x}$.


$^2$ By this, we mean that the pattern of non-zero off-diagonal elements is the same as for an
irreducible intensity matrix.

Proof. Let $\{J_t^{(a)}\}$ be the phase process associated with $B^{(a)}$ and $\zeta^{(a)}$ its
lifelength, let $\{Y_t^{(a)}\}$ be a Markov process with initial distribution $\beta$ and intensity
matrix $aQ$, and write $Y_t = Y_t^{(1)}$, $\zeta = \zeta^{(1)}$ etc. We can assume that $J_t^{(a)} = Y_t^{(a)}$,
$t < \zeta^{(a)}$, and that $Y_t^{(a)} = Y_{at}$ for all $t$. Let further $V$ be exponential with
intensity 1 and independent of everything else. We can think of $\zeta^{(a)}$ as the first
event in an inhomogeneous Poisson process (Cox process) with intensity process
$\{t_{Y_v^{(a)}}\}_{v\ge 0}$. Hence we can represent $\zeta^{(a)}$ as

$$\zeta^{(a)} = \inf\Big\{t > 0 : \int_0^t t_{Y_v^{(a)}}\,dv = V\Big\} = \inf\Big\{t > 0 : \int_0^t t_{Y_{av}}\,dv = V\Big\}$$

$$= \inf\Big\{t > 0 : \int_0^{at} t_{Y_v}\,dv = aV\Big\} = \frac{1}{a}\,\sigma(aV),$$

where $\sigma(x) = \inf\{t > 0 : \int_0^t t_{Y_v}\,dv = x\}$. By the law of large numbers for
Markov processes, $\int_0^t t_{Y_v}\,dv/t \xrightarrow{\text{a.s.}} t^*$, and this easily yields $\sigma(a)/a \xrightarrow{\text{a.s.}} 1/t^*$.
Hence $\zeta^{(a)} \xrightarrow{\text{a.s.}} V/t^*$.


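We shall, in fact, prove a somewhat more general result which was used in
the proof of Proposition VII.1.9. In addition to the asymptotic exponentiality,
it states that the state, from which the phase process is terminated, has a limit
distribution:


Proposition A5.2 $\displaystyle \mathbb{P}_i\big(\zeta^{(a)} > x,\ J^{(a)}_{\zeta^{(a)}-} = j\big) \to e^{-t^*x}\,\frac{\alpha_jt_j}{t^*}$.


Proof. Assume first $t_i > 0$ for all $i$ and let $I_x = Y_{\sigma(x)}$. Then $\{I_x\}$ is a Markov
process with $I_0 = Y_0$. Conditioning upon whether $\{Y_t\}$ changes state in $[0, dx/t_i]$
or not, we get

$$\mathbb{P}_i(I_{dx} = j) = \Big(1 + q_{ii}\frac{dx}{t_i}\Big)\delta_{ij} + q_{ij}\frac{dx}{t_i}(1 - \delta_{ij}).$$

Hence the intensity matrix of $\{I_x\}$ is $(q_{ij}/t_i)_{i,j\in E}$, from which it is easily checked
that the limiting stationary distribution is $(\alpha_it_i/t^*)_{i\in E}$.

Now let $a' \to \infty$ with $a$ in such a way that $a' < a$, $a'/a \to 1$, $a - a' \to \infty$
(e.g. $a' = a - a^\epsilon$ where $0 < \epsilon < 1$). Then $\sigma(a'V)/\sigma(aV) \xrightarrow{\text{a.s.}} 1$. Since

$$J^{(a)}_{\zeta^{(a)}-} = Y^{(a)}_{\zeta^{(a)}} = Y_{a\zeta^{(a)}} = Y_{\sigma(aV)},$$

it follows that

$$\mathbb{P}_i\big(\zeta^{(a)} > x,\ J^{(a)}_{\zeta^{(a)}-} = j\big) = \mathbb{P}_i\Big(\frac{\sigma(aV)}{a} > x,\ Y_{\sigma(aV)} = j\Big)$$

$$\approx \mathbb{P}_i\Big(\frac{\sigma(a'V)}{a} > x,\ Y_{\sigma(aV)} = j\Big)$$

$$= \mathbb{E}_i\Big[I\Big(\frac{\sigma(a'V)}{a} > x\Big)\,\mathbb{P}\big(Y_{\sigma(aV)} = j \,\big|\, \mathscr{F}_{\sigma(a'V)}\big)\Big]$$

$$\approx \mathbb{P}_i\Big(\frac{\sigma(a'V)}{a} > x\Big)\,\frac{\alpha_jt_j}{t^*} \to e^{-t^*x}\,\frac{\alpha_jt_j}{t^*}.$$

Reducing the state space of $\{I_x\}$ to $\{i \in E : t_i > 0\}$, an easy modification of
the argument yields finally the result for the case where $t_i = 0$ for one or more $i$.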


Notes and references Propositions A5.1 and A5.2 do not appear to be in the 
literature. However, these results are in the spirit of rare events theory for regenerative 
processes (e.g. Keilson [523], Gnedenko & Kovalenko [420] and Glasserman & Kou 
[418]). See also Korolyuk, Penev & Turbin [555]. 


5b Discrete phase-type distributions 


The theory of discrete phase-type distributions is a close parallel of the
continuous case, so we shall be brief.

A distribution $B$ on $\{1, 2, \ldots\}$ is said to be discrete phase-type with
representation $(E, P, \alpha)$ if $B$ is the lifelength of a terminating Markov chain (in discrete
time) on $E$ which has transition matrix $P = (p_{ij})$ and initial distribution $\alpha$.
Then $P$ is substochastic and the vector of exit probabilities is $p = e - Pe$.


Example A5.3 As the exponential distribution is the simplest continuous
phase-type distribution, so is the geometric distribution, with point probabilities
$b_k = (1 - p)^{k-1}p$, $k = 1, 2, \ldots$, the simplest discrete phase-type distribution: here $E$
has only one element, and thus the parameter $p$ of the geometric distribution
can be identified with the exit probability vector $p$.


Example A5.4 Any discrete distribution $B$ with finite support, say $b_k = 0$,
$k > K$, is discrete phase-type. Indeed, we can take $E = \{1, \ldots, K\}$, $\alpha_k = b_k$
and

$$p_{kj} = \begin{cases} 1 & k > 1,\ j = k - 1, \\ 0 & \text{otherwise}, \end{cases} \qquad p_k = \begin{cases} 1 & k = 1, \\ 0 & k > 1. \end{cases}$$

Theorem A5.5 Let $B$ be discrete phase-type with representation $(P, \alpha)$. Then:
(a) The point probabilities are $b_k = \alpha P^{k-1}p$;

(b) the generating function $\widehat{b}[z] = \sum_{k=1}^\infty z^kb_k$ is $z\alpha(I - zP)^{-1}p$;

(c) the $n$th factorial moment $\sum_{k=1}^\infty k(k-1)\cdots(k-n+1)\,b_k$ is $n!\,\alpha P^{n-1}(I - P)^{-n}e$.
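Part (a) translates directly into a short recursion on the row vector $\alpha P^{k-1}$. The following Python sketch is a minimal illustration with hypothetical function names; it takes $P$ as a list of rows and computes the exit probability vector as $p = e - Pe$.

```python
def vec_mat(v, m):
    """Row vector times matrix."""
    return [sum(v[i] * m[i][j] for i in range(len(v))) for j in range(len(m[0]))]


def exit_probabilities(p_matrix):
    """p = e - P e: probability of leaving the phase space from each state."""
    return [1.0 - sum(row) for row in p_matrix]


def point_probabilities(alpha, p_matrix, k_max):
    """Return [b_1, ..., b_{k_max}] with b_k = alpha P^{k-1} p."""
    exit_vec = exit_probabilities(p_matrix)
    row = list(alpha)                      # holds alpha P^{k-1}, starting at k = 1
    probs = []
    for _ in range(k_max):
        probs.append(sum(r * x for r, x in zip(row, exit_vec)))
        row = vec_mat(row, p_matrix)
    return probs
```

For instance, with `alpha = [1.0, 0.0]` and `p_matrix = [[0.7, 0.3], [0.0, 0.7]]` the chain must pass through two geometric stages, and the resulting `b_k` are the negative binomial probabilities $(k-1)(0.7)^{k-2}(0.3)^2$.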


5c Closure properties 


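Example A5.6 (CONVOLUTIONS) Let $B_1, B_2$ be phase-type with representations
$(E^{(1)}, \alpha^{(1)}, T^{(1)})$, resp. $(E^{(2)}, \alpha^{(2)}, T^{(2)})$. Then the convolution $B =
B_1 * B_2$ is phase-type with representation $(E, \alpha, T)$ where $E = E^{(1)} + E^{(2)}$ is
the disjoint union of $E^{(1)}$ and $E^{(2)}$, and

$$\alpha_i = \begin{cases} \alpha_i^{(1)}, & i \in E^{(1)} \\ 0, & i \in E^{(2)} \end{cases}, \qquad T = \begin{pmatrix} T^{(1)} & t^{(1)}\alpha^{(2)} \\ 0 & T^{(2)} \end{pmatrix} \tag{A.36}$$

in block-partitioned notation (where we could also write $\alpha$ as $(\alpha^{(1)}\ \ 0)$). A
reduced phase diagram (omitting transitions within the two blocks) is

$$\alpha^{(1)} \longrightarrow E^{(1)} \xrightarrow{\ t^{(1)}\alpha^{(2)}\ } E^{(2)} \xrightarrow{\ t^{(2)}\ } \Delta$$

FIGURE A.2


The form of these results is easily recognized if one considers two independent
phase processes $\{J_t^{(1)}\}$, $\{J_t^{(2)}\}$ with lifetimes $U_1$, resp. $U_2$, and pieces the
processes together by

$$J_t = \begin{cases} J_t^{(1)}, & 0 \le t < U_1 \\ J_{t-U_1}^{(2)}, & U_1 \le t < U_1 + U_2 \\ \Delta, & t \ge U_1 + U_2. \end{cases}$$

Then $\{J_t\}$ has lifetime $U_1 + U_2$, initial distribution $\alpha$ and phase generator $T$.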
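The block structure of the convolution representation can be assembled mechanically. The sketch below is an illustration only (the function name and the list-of-rows encoding are assumptions): it computes the exit rate vector $t^{(1)} = -T^{(1)}e$ and then builds $\alpha$ and $T$ block by block.

```python
def convolution_representation(alpha1, t1, alpha2, t2):
    """Return (alpha, T) for B1 * B2 in block form, with matrices as lists of rows."""
    n1, n2 = len(alpha1), len(alpha2)
    exit1 = [-sum(row) for row in t1]                 # exit rates t^(1) = -T^(1) e
    alpha = list(alpha1) + [0.0] * n2                 # (alpha^(1)  0)
    # upper blocks: T^(1) and t^(1) alpha^(2)
    top = [list(t1[i]) + [exit1[i] * a for a in alpha2] for i in range(n1)]
    # lower blocks: 0 and T^(2)
    bottom = [[0.0] * n1 + list(t2[i]) for i in range(n2)]
    return alpha, top + bottom
```

For two exponential distributions with rates 2 and 5, the call `convolution_representation([1.0], [[-2.0]], [1.0], [[-5.0]])` gives the familiar two-phase representation with generator rows $(-2, 2)$ and $(0, -5)$.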


Example A5.7 (THE NEGATIVE BINOMIAL DISTRIBUTION) The most trivial
special case of Example A5.6 is the Erlang distribution $E_r$ which is the
convolution of $r$ exponential distributions. The discrete counterpart is the negative
binomial distribution with point probabilities

$$b_k = \binom{k-1}{r-1}(1 - p)^{k-r}p^r, \qquad k = r, r+1, \ldots.$$

This corresponds to a convolution of $r$ geometric distributions with the same
parameter $p$, and hence the negative binomial distribution is discrete phase-type,
as is seen by minor modifications of Example A5.6.


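Example A5.8 (FINITE MIXTURES) Let $B_1, B_2$ be phase-type with representations
$(E^{(1)}, \alpha^{(1)}, T^{(1)})$, resp. $(E^{(2)}, \alpha^{(2)}, T^{(2)})$. Then the mixture $B =
\theta B_1 + (1 - \theta)B_2$ ($0 < \theta < 1$) is phase-type with representation $(E, \alpha, T)$ where
$E = E^{(1)} + E^{(2)}$ is the disjoint union of $E^{(1)}$ and $E^{(2)}$, and

$$\alpha_i = \begin{cases} \theta\alpha_i^{(1)}, & i \in E^{(1)} \\ (1 - \theta)\alpha_i^{(2)}, & i \in E^{(2)} \end{cases}, \qquad T = \begin{pmatrix} T^{(1)} & 0 \\ 0 & T^{(2)} \end{pmatrix} \tag{A.37}$$

(in block-partitioned notation, this means that $\alpha = (\theta\alpha^{(1)}\ \ (1 - \theta)\alpha^{(2)})$). A
reduced phase diagram is

$$\theta\alpha^{(1)} \longrightarrow E^{(1)} \xrightarrow{\ t^{(1)}\ } \Delta, \qquad (1 - \theta)\alpha^{(2)} \longrightarrow E^{(2)} \xrightarrow{\ t^{(2)}\ } \Delta$$

FIGURE A.3


In exactly the same way, a mixture of more than two phase-type distributions
is seen to be phase-type. In risk theory, one obvious interpretation of the claim
size distribution $B$ being a mixture is that there are several types of claims.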


Example A5.9 (INFINITE MIXTURES WITH $T$ FIXED) Assume that $\alpha = \alpha^{(a)}$
depends on a parameter $a \in A$ whereas $E$ and $T$ are the same for all $a$. Let $B^{(a)}$
be the corresponding phase-type distribution, and consider $B^{(\nu)} = \int_A B^{(a)}\,\nu(da)$
where $\nu$ is a probability measure on $A$. Then it is trivial to see that $B^{(\nu)}$ is
phase-type with representation $(\alpha^{(\nu)}, T, E)$ where $\alpha^{(\nu)} = \int_A \alpha^{(a)}\,\nu(da)$.


Example A5.10 (GEOMETRIC COMPOUNDS) Let B be phase-type with repre- 
sentation (E, aœ, T) and C = X% (1 — p)p” -1B*". Equivalently, if U1, U2,... 
are i.i.d. with common distribution and N is independent of the Ux, and geo- 
metrically distributed with parameter p, P(N = n) = (1 — p)p”"~1, then C is 
the distribution of Ui +---+ Uy. To obtain a phase process for C, we need 
to restart the phase process for B w.p. p at each termination. Thus, a reduced 
phase diagram is 
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ae 
S| EB 1 A 


FIGURE A.4 


and $C$ is phase-type with representation $(E, \alpha, T + p\,t\alpha)$. Minor
modifications of the argument show that


1. If $U_1$ has a different initial vector, say $\nu$, but the same $T$, then
$U_1 + \cdots + U_N$ is phase-type with representation
$(E, \nu, T + p\,t\alpha)$;


2. if $B$ is defective and $N+1$ is the first $n$ with $U_n = \infty$, then
$U_1 + \cdots + U_N$ is zero-modified phase-type with representation
$(E, \alpha, T + t\alpha)$. Note that this was exactly the structure of the
lifetime of a terminating renewal process, cf. Corollary IX.2.2.
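
As a hedged numerical illustration of the representation $(E, \alpha, T + p\,t\alpha)$ in Example A5.10, the sketch below builds the compound generator and compares the resulting mean with $E N \cdot E U_1 = \mu_B/(1-p)$ (Wald's identity). The function names and the Erlang(2) choice of $B$ are assumptions made for the example only.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def ph_mean(alpha, T):
    """Mean -alpha T^{-1} e of a phase-type distribution."""
    m = solve(T, [-1.0] * len(alpha))
    return sum(a * x for a, x in zip(alpha, m))


def geometric_compound(p, alpha, T):
    """Representation (alpha, T + p t alpha) of the geometric compound C."""
    n = len(alpha)
    t = [-sum(row) for row in T]  # exit rate vector t = -T e
    Tc = [[T[i][j] + p * t[i] * alpha[j] for j in range(n)] for i in range(n)]
    return list(alpha), Tc


# Illustrative choice: B = Erlang(2) with rate 3 (mean 2/3), p = 0.6.
alpha, T = [1.0, 0.0], [[-3.0, 3.0], [0.0, -3.0]]
p = 0.6
alpha_c, T_c = geometric_compound(p, alpha, T)
c_mean = ph_mean(alpha_c, T_c)
wald_mean = (2.0 / 3.0) / (1.0 - p)  # E N = 1/(1-p) for P(N=n) = (1-p)p^(n-1)
```

The two means coincide, as they should if restarting the phase process w.p. $p$ at each termination is encoded correctly by the rank-one term $p\,t\alpha$.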


Example A5.11 (OVERSHOOTS) The overshoot of $U$ over $x$ is defined as the
distribution of $(U - x)^+$. It is zero-modified phase-type with representation
$(E, \alpha e^{Tx}, T)$ if $U$ is phase-type with representation
$(E, \alpha, T)$. Indeed, if $\{J_t\}$ is a phase process for $U$, then $J_x$
has distribution $\alpha e^{Tx}$.

If we replace $x$ by a r.v. $X$ independent of $U$, say with distribution $F$,
it follows by mixing (Example A5.9) that $(U - X)^+$ is zero-modified
phase-type with representation $(E, \alpha \hat F[T], T)$ where

$$\hat F[T] = \int_0^\infty e^{Tx}\,F(dx)$$

is the matrix m.g.f. of $F$, cf. Proposition IX.1.7.
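
As a small worked instance of Example A5.11 (our own illustration, not taken from the text), take $U$ exponential with rate $\delta$, so that $E$ has a single phase, $\alpha = 1$ and $T = -\delta$. The zero-modified representation then reduces to the familiar memoryless property:

```latex
% U exponential with rate \delta: \alpha = 1, T = -\delta, t = \delta.
\alpha e^{Tx} = e^{-\delta x}, \qquad
P\bigl((U-x)^+ = 0\bigr) = 1 - e^{-\delta x}, \qquad
P\bigl((U-x)^+ > y\bigr) = \alpha e^{Tx}\, e^{Ty} = e^{-\delta x}\, e^{-\delta y},
\quad y \ge 0 .
```

The atom at zero has mass $1 - \|\alpha e^{Tx}\|$, and conditionally on $U > x$ the overshoot is again exponential with rate $\delta$.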


Example A5.12 (PHASE-TYPE COMPOUNDS) Let $f_1, f_2, \ldots$ be the point
probabilities of a discrete phase-type distribution with representation
$(E, \alpha, P)$, let $B$ be a continuous phase-type distribution with
representation $(F, \nu, T)$ and $C = \sum_{n=1}^{\infty} f_n B^{*n}$.
Equivalently, if $U_1, U_2, \ldots$ are i.i.d. with common distribution $B$ and
$N$ is independent of the $U_k$ with $P(N = n) = f_n$, then $C$ is the
distribution of $U_1 + \cdots + U_N$. To obtain a phase representation for $C$,
let the phase space be $E \times F = \{ij : i \in E, j \in F\}$, let the initial
vector be $\alpha \otimes \nu$ and let the phase generator be
$I \otimes T + P \otimes (t\nu)$.
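
The Kronecker construction of Example A5.12 can be checked on a small case. The sketch below is illustrative only: the helper names and the numerical values of $(\alpha, P)$ and $(\nu, T)$ are assumptions, and the check again relies on Wald's identity $E C = E N \cdot E U_1$, with $E N = \alpha (I - P)^{-1} e$ for a discrete phase-type $N$.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]


def eye(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def ph_compound(alpha, P, nu, T):
    """Initial vector alpha (x) nu and generator I (x) T + P (x) (t nu)."""
    t = [-sum(row) for row in T]                  # exit rates of B
    t_nu = [[ti * vj for vj in nu] for ti in t]   # restart matrix t nu
    init = [a * v for a in alpha for v in nu]
    gen = add(kron(eye(len(alpha)), T), kron(P, t_nu))
    return init, gen


# Illustrative data: N discrete phase-type, B = Erlang(2) with rate 3.
alpha, P = [1.0, 0.0], [[0.2, 0.5], [0.0, 0.4]]
nu, T = [1.0, 0.0], [[-3.0, 3.0], [0.0, -3.0]]
init, gen = ph_compound(alpha, P, nu, T)

m = solve(gen, [-1.0] * len(init))
mean_C = sum(a * x for a, x in zip(init, m))

I_minus_P = [[(1.0 if i == j else 0.0) - P[i][j] for j in range(2)] for i in range(2)]
mean_N = sum(a * x for a, x in zip(alpha, solve(I_minus_P, [1.0, 1.0])))
mean_U = 2.0 / 3.0
```

Here `mean_C` and `mean_N * mean_U` agree, which is consistent with the generator restarting the $B$-phase according to $\nu$ each time the discrete chain makes a step.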


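Example A5.13 (MINIMA AND MAXIMA) Let $U_1$, $U_2$ be random variables with
distributions $B_1$, $B_2$ of phase-type with representations
$(E^{(1)}, \alpha^{(1)}, T^{(1)})$, resp. $(E^{(2)}, \alpha^{(2)}, T^{(2)})$.
Then the minimum $U_1 \wedge U_2$ and the maximum $U_1 \vee U_2$ are again
phase-type.

To see this, let $\{J_t^{(1)}\}$, $\{J_t^{(2)}\}$ be independent with lifetimes
$U_1$, resp. $U_2$. For $U_1 \wedge U_2$, we then let the governing phase
process be $\{J_t\} = \{(J_t^{(1)}, J_t^{(2)})\}$, interpreting exit of either
of $\{J_t^{(1)}\}$, $\{J_t^{(2)}\}$ as exit of $\{J_t\}$. Thus the
representation is

$$(E^{(1)} \times E^{(2)},\ \alpha^{(1)} \otimes \alpha^{(2)},\ T^{(1)} \oplus T^{(2)}).$$

For $U_1 \vee U_2$, we need to allow $\{J_t^{(2)}\}$ to go on (on $E^{(2)}$)
when $\{J_t^{(1)}\}$ exits, and vice versa. Thus the state space is
$E^{(1)} \times E^{(2)} \cup E^{(1)} \cup E^{(2)}$, the initial vector is
$(\alpha^{(1)} \otimes \alpha^{(2)} \;\; 0 \;\; 0)$, and the phase generator is

$$\begin{pmatrix} T^{(1)} \oplus T^{(2)} & I \otimes t^{(2)} & t^{(1)} \otimes I \\ 0 & T^{(1)} & 0 \\ 0 & 0 & T^{(2)} \end{pmatrix}$$

where $t^{(i)} = -T^{(i)} e$ is the exit rate vector of $B_i$.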

Notes and references The results of the present section are standard, see Neuts 
[660] (where the proof, however, relies more on matrix algebra than the probabilistic 
interpretation exploited here). 


5d Phase-type approximation


A fundamental property of phase-type distributions is denseness. That is, any
distribution $B$ on $(0,\infty)$ can be approximated 'arbitrarily close' by a
phase-type distribution:


Theorem A5.14 To a given distribution $B$ on $(0,\infty)$, there is a sequence
$\{B_n\}$ of phase-type distributions such that
$B_n \xrightarrow{\mathcal{D}} B$ as $n \to \infty$.


Proof. Assume first that $B$ is a one-point distribution, say degenerate at
$b$, and let $B_n$ be the Erlang distribution $E_n(\delta_n)$ with
$\delta_n = n/b$. The mean of $B_n$ is $n/\delta_n = b$ and the variance is
$n/\delta_n^2 = b^2/n$. Hence it is immediate that
$B_n \xrightarrow{\mathcal{D}} B$.

The general case now follows easily from this, the fact that any distribution 
B can be approximated arbitrarily close by a distribution with finite support, 
and the closedness of the class of phase-type distributions under the formation 
of finite mixtures, cf. Example A5.8. Here are the details at two somewhat 
different levels of abstraction: 


(diagonal argument, elementary) Let $\{b_k\}$ be any dense sequence of
continuity points for $B(x)$. Then we must find phase-type distributions $B_n$
with $B_n(b_k) \to B(b_k)$ for all $k$. Now we can find first a sequence
$\{D_n\}$ of distributions with finite support such that
$D_n(b_k) \to B(b_k)$ for all $k$ as $n \to \infty$. By the diagonal argument
(subsequent thinnings), we can assume that $|D_n(b_k) - B(b_k)| < 1/n$ for
$n > k$. Let the support of $D_n$ be $\{x_1(n), \ldots, x_{q(n)}(n)\}$, with
weight $p_i(n)$ for $x_i(n)$. Then from above,

$$C_{r,n} = \sum_{i=1}^{q(n)} p_i(n)\, E_r\!\left(\frac{r}{x_i(n)}\right) \xrightarrow{\mathcal{D}} \sum_{i=1}^{q(n)} p_i(n)\, \delta_{x_i(n)} = D_n, \quad r \to \infty.$$

Hence we can choose $r(n)$ in such a way that

$$|C_{r(n),n}(b_k) - D_n(b_k)| < \frac{1}{n}, \quad k < n.$$

Then

$$|C_{r(n),n}(b_k) - B(b_k)| < \frac{2}{n}, \quad k < n,$$

and we can take $B_n = C_{r(n),n}$.


(abstract topological) The essence of the argument above is that the closure
(w.r.t. the topology for weak convergence) $\overline{\mathcal{PHT}}$ of the
class $\mathcal{PHT}$ of phase-type distributions contains all one-point
distributions. Since $\mathcal{PHT}$ is closed under the continuous operation
of formation of finite mixtures, $\overline{\mathcal{PHT}}$ contains all finite
mixtures of one-point distributions, i.e. the class $\mathcal{P}_0$ of all
discrete distributions. But $\overline{\mathcal{P}_0}$ is the class
$\mathcal{P}$ of all distributions on $[0,\infty)$. Hence
$\mathcal{P} \subseteq \overline{\mathcal{PHT}}$ and
$\mathcal{P} = \overline{\mathcal{PHT}}$.
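
The first step of the proof, and the mixture $C_{r,n}$ used in the diagonal argument, can be illustrated numerically. The sketch below evaluates the Erlang c.d.f. in closed form and shows how $E_n(n/b)$ concentrates around $b$; the function names, the point $b = 2$ and the two-point distribution used at the end are illustrative assumptions.

```python
import math


def erlang_cdf(x, n, rate):
    """C.d.f. at x of the Erlang distribution with n stages and the given rate."""
    if x <= 0:
        return 0.0
    lx = rate * x
    # P(Erlang <= x) = 1 - sum_{k<n} e^{-lx} lx^k / k!, evaluated through
    # logarithms so that large n does not overflow.
    tail = sum(math.exp(-lx + k * math.log(lx) - math.lgamma(k + 1))
               for k in range(n))
    return min(1.0, max(0.0, 1.0 - tail))


def erlang_mixture_cdf(x, points, weights, r):
    """C.d.f. of sum_i p_i E_r(r / x_i), an approximation to the discrete
    distribution with atoms x_i and weights p_i."""
    return sum(w * erlang_cdf(x, r, r / p) for p, w in zip(points, weights))


b = 2.0
stages = (10, 100, 400)
# Mass below 0.8*b should vanish and mass below 1.2*b should tend to one.
below = [erlang_cdf(0.8 * b, n, n / b) for n in stages]
above = [erlang_cdf(1.2 * b, n, n / b) for n in stages]

# Two-point distribution with atoms 1 and 3 (weights 0.3 and 0.7),
# evaluated at the continuity point x = 2.
mix_at_2 = erlang_mixture_cdf(2.0, [1.0, 3.0], [0.3, 0.7], 400)
```

The number of phases of the approximating mixture is $r$ times the number of atoms, which already hints at why this construction is not economical in practice.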


Theorem A5.14 is fundamental and can motivate phase-type assumptions, 
say on the claim size distribution B in risk theory, in at least two ways: 


insensitivity Suppose we are able to verify a specific result when $B$ is of
phase-type, say that two functionals $\varphi_1(B)$ and $\varphi_2(B)$
coincide. If $\varphi_1(B)$ and $\varphi_2(B)$ are weakly continuous, then it
is immediate that $\varphi_1(B) = \varphi_2(B)$ for all distributions $B$ on
$[0,\infty)$.


approximation Assume that we can compute a functional $\varphi(B)$ when $B$ is
phase-type, and that $\varphi$ is known to be continuous. For a general $B_0$,
we can then approximate $B_0$ by a phase-type $B$, compute $\varphi(B)$ and use
this quantity as an approximation to $\varphi(B_0)$. In particular, if
information on $B_0$ is given in terms of observations (i.i.d. replications),
one would use the $B$ given by some statistical fitting procedure (see below).
It should be noted, however, that this procedure should be used with care if
$\varphi(B)$ is the ruin probability $\psi(u)$ and $u$ is large.


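Let $\mathcal{E}$ be the class of functions $f : [0,\infty) \to [0,\infty)$
such that $f(x) = O(e^{ax})$, $x \to \infty$, for some $a < \infty$.


Corollary A5.15 To a given distribution $B$ on $(0,\infty)$ and any
$f_1, f_2, \ldots \in \mathcal{E}$, there is a sequence $\{B_n\}$ of phase-type
distributions such that $B_n \xrightarrow{\mathcal{D}} B$ as $n \to \infty$ and
$\int_0^\infty f_i(x)\,B_n(dx) \to \int_0^\infty f_i(x)\,B(dx)$,
$i = 1, 2, \ldots$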


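Proof. By Fatou's lemma, $B_n \xrightarrow{\mathcal{D}} B$ implies that

$$\liminf_{n\to\infty} \int_0^\infty f_i(x)\,B_n(dx) \ge \int_0^\infty f_i(x)\,B(dx)$$

for each $i$, and hence it is sufficient to show that we can obtain

$$\limsup_{n\to\infty} \int_0^\infty f_i(x)\,B_n(dx) \le \int_0^\infty f_i(x)\,B(dx), \quad i = 1, 2, \ldots \tag{A.38}$$

We first show that for each $f \in \mathcal{E}$,

$$B = \delta_z,\ B_n = E_n(n/z) \implies \int_0^\infty f(x)\,B_n(dx) \to \int_0^\infty f(x)\,B(dx) = f(z). \tag{A.39}$$

Indeed, if $f(x) = e^{ax}$, then, with $\delta_n = n/z$,

$$\int_0^\infty f(x)\,B_n(dx) = \left(\frac{\delta_n}{\delta_n - a}\right)^n \to e^{az} = \int_0^\infty f(x)\,B(dx),$$

and the case of a general $f$ then follows from the definition of the class
$\mathcal{E}$ and a uniform integrability argument.

Now returning to the proof of (A.38), we may assume that in the proof of
Theorem A5.14 $D_n$ has been chosen such that

$$\int_0^\infty f_i(x)\,D_n(dx) \le \left(1 + \frac{1}{n}\right) \int_0^\infty f_i(x)\,B(dx), \quad i = 1, \ldots, n.$$

By (A.39),

$$\int_0^\infty f_i(x)\,C_{r,n}(dx) \to \int_0^\infty f_i(x)\,D_n(dx), \quad r \to \infty,$$

and hence we may choose $r(n)$ such that

$$\int_0^\infty f_i(x)\,C_{r(n),n}(dx) \le \left(1 + \frac{2}{n}\right) \int_0^\infty f_i(x)\,B(dx), \quad i = 1, \ldots, n.$$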
Corollary A5.16 To a given distribution $B$ on $(0,\infty)$, there is a
sequence $\{B_n\}$ of phase-type distributions such that
$B_n \xrightarrow{\mathcal{D}} B$ as $n \to \infty$ and all moments converge,
$\int_0^\infty x^i\,B_n(dx) \to \int_0^\infty x^i\,B(dx)$, $i = 1, 2, \ldots$


In compound Poisson risk processes with arrival intensity $\beta$ and claim
size distribution $B$ satisfying $\beta\mu_B < 1$, the adjustment coefficient
$\gamma = \gamma(B, \beta)$ is defined as the unique solution $> 0$ of
$\hat B[\gamma] = 1 + \gamma/\beta$. The adjustment coefficient is a
fundamental quantity, and therefore the following result is highly relevant as
support for phase-type assumptions in risk theory:


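Corollary A5.17 To a given $\beta > 0$ and a given distribution $B$ on
$(0,\infty)$ with $\hat B[\gamma + \epsilon] < \infty$ for some
$\epsilon > 0$, $\gamma = \gamma(B, \beta)$, there is a sequence $\{B_n\}$ of
phase-type distributions such that $B_n \xrightarrow{\mathcal{D}} B$ as
$n \to \infty$ and $\gamma_n \to \gamma$ where $\gamma_n = \gamma(B_n, \beta)$.

Proof. Let $f_i(x) = e^{(\gamma + \epsilon_i)x}$ for some sequence
$\{\epsilon_i\}$ with $\epsilon_i \in (0, \epsilon)$ and
$\epsilon_i \downarrow 0$ as $i \to \infty$, and let $\{B_n\}$ be as in
Corollary A5.15. If $\epsilon_i > 0$, then

$$\hat B_n[\gamma + \epsilon_i] \to \hat B[\gamma + \epsilon_i] > 1 + \frac{\gamma + \epsilon_i}{\beta}$$

implies that $\gamma_n < \gamma + \epsilon_i$ for all sufficiently large $n$.
I.e. $\limsup \gamma_n \le \gamma$. $\liminf \ge$ is proved similarly.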
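
The convergence $\gamma_n \to \gamma$ can be observed numerically in a simple case. The sketch below is an illustration under our own assumptions: $B$ is degenerate at $b = 1$, $B_n$ is the Erlang distribution $E_n(n/b)$ from the proof of Theorem A5.14, $\beta = 0.5$, and the root of $\hat B[\gamma] = 1 + \gamma/\beta$ is located by bisection on the logarithmic form of the equation. The function names are hypothetical.

```python
import math


def adjustment_coefficient(log_mgf, beta, upper):
    """Positive root of log B^[s] = log(1 + s/beta) on (0, upper), by bisection.

    Assumes beta * mean < 1, so that log B^[s] < log(1 + s/beta) just to the
    right of 0 and the inequality is reversed beyond the root.
    """
    lo, hi = 0.0, upper
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if log_mgf(mid) < math.log1p(mid / beta):
            lo = mid   # still to the left of the root
        else:
            hi = mid
    return 0.5 * (lo + hi)


def erlang_log_mgf(n, b):
    """Log m.g.f. of the Erlang distribution E_n(n/b), finite for s < n/b."""
    return lambda s: -n * math.log1p(-s * b / n)


beta, b = 0.5, 1.0
# Limit: B degenerate at b, with m.g.f. e^{sb}.
gamma_limit = adjustment_coefficient(lambda s: s * b, beta, 50.0)
# Approximations: B_n = E_n(n/b); the case n = 1 is exponential with rate 1/b.
gammas = [adjustment_coefficient(erlang_log_mgf(n, b), beta, n / b)
          for n in (1, 10, 100, 1000)]
```

For $n = 1$ the claim size is exponential with rate $\delta = 1/b$ and the familiar closed form $\gamma = \delta - \beta$ is recovered; as $n$ grows the values increase towards the adjustment coefficient of the degenerate distribution.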


We state without proof the following result: 


Corollary A5.18 In the setting of Corollary A5.16, one can obtain
$\gamma(B_n, \beta) = \gamma$ for all $n$.


Notes and references Theorem A5.14 is classical; the remaining results may be 
slightly stronger than those given in the literature, but are certainly not unexpected. 


5e Phase-type fitting 


As has been mentioned a number of times already, there is substantial advantage
in assuming the claim sizes to be phase-type when one wants to compute ruin
probabilities. For practical purposes, the problem thus arises of how to fit a
phase-type distribution $B$ to a given set of data $\zeta_1, \ldots, \zeta_N$.
The present section is a survey of some of the available approaches and
software for implementing this.

We shall formulate the problem in the slightly broader setting of fitting a
phase-type distribution $B$ to a given set of data $\zeta_1, \ldots, \zeta_N$
or a given distribution $B_0$. This is motivated in part from the fact that a
number of non-phase-type distributions like the lognormal, the loggamma or the
Weibull have been argued to provide adequate descriptions of claim size
distributions, and in part from the fact that many of the algorithms that we
describe below have been formulated within the set-up of fitting distributions.
However, from a more conceptual point of view the two sets of problems are
hardly different: an equivalent representation of a set of data
$\zeta_1, \ldots, \zeta_N$ is the empirical distribution $B_e$, giving mass
$1/N$ to each $\zeta_i$.

Of course, one could argue that the results of the preceding section concerning
phase-type approximation contain a solution to our problem: given $B_0$ (or
$B_e$), we have constructed a sequence $\{B_n\}$ of phase-type distributions
such that $B_n \xrightarrow{\mathcal{D}} B_0$, and as fitted distribution we
may take $B_n$ for some suitably large $n$. The problem is that the
constructions of $\{B_n\}$ are not economical: the number of phases grows
rapidly, and in practice this sets a limitation to the usefulness (the curse of
dimensionality; we do not want to perform matrix calculus in hundreds or
thousands of dimensions).

A number of approaches restrict the phase-type distribution to a suitable 
class of mixtures of Erlang distributions. The earliest such reference is Bux 
& Herzog [211] who assumed that the Erlang distributions have the same rate 
parameter, and used a non-linear programming approach. The constraints were 
the exact fit of the two first moments and the objective function to be minimized 
involved the deviation of the empirical and fitted c.d.f. at a number of selected 
points. In a series of papers (e.g. [509]), Johnson & Taaffe considered a mixture 
of two Erlangs (with different rates) and matched (when possible) the first three 
moments. Schmickler (the MEDA package; e.g. [767]) has considered an exten- 
sion of this set-up, where more than two Erlangs are allowed and in addition to 
the exact matching of the first three moments a more general deviation measure 
is minimized (e.g. the $L_1$ distance between the c.d.f.'s). 

A characteristic of all of these methods is that even though the number of
parameters may be low (e.g. three for a mixture of two Erlangs), the number of 
phases required for a good fit will typically be much larger, and this is what 
matters when using phase-type distributions as computational vehicle in say 
renewal theory, risk theory, reliability or queueing theory. It seems therefore a 
key issue to develop methods allowing for a more general phase diagram, and we 
next describe two such approaches which also have the feature of being based 
upon the traditional statistical tool of maximum likelihood. 

A method developed by Bobbio and co-workers (see e.g. [179]) restricts at- 
tention to acyclic phase-type distributions, defined by the absence of loops in 
the phase diagram. The likelihood function is maximized by a local linearization 
method allowing to use linear programming techniques. 

Asmussen & Nerman [91] implemented maximum likelihood in the full class of
phase-type distributions via the EM algorithm; a program package written in C
for the SUN workstation or the PC is available as shareware, cf. [476]. The
observation is that the statistical problem would be straightforward if the
whole ($E_\Delta$-valued) phase process $\{J_t^{(k)}\}_{0 \le t \le \zeta_k}$
associated with each observation $\zeta_k$ was available. In fact, then the
estimators would be of simple occurrence-exposure type,

$$\hat\alpha_i = \frac{1}{N} \sum_{k=1}^{N} I\bigl(J_0^{(k)} = i\bigr), \qquad \hat t_{ij} = \frac{N_{ij}}{T_i}, \quad i \in E,\ j \in E_\Delta,$$

where

$$T_i = \sum_{k=1}^{N} \int_0^{\zeta_k} I\bigl(J_t^{(k)} = i\bigr)\,dt, \qquad N_{ij} = \sum_{k=1}^{N} \sum_{t \in [0,\zeta_k]} I\bigl(J_{t-}^{(k)} = i,\ J_t^{(k)} = j\bigr)$$

($T_i$ is the total time spent in state $i$ and $N_{ij}$ is the total number of
jumps from $i$ to $j$). The general idea of the EM algorithm ([291]) is to
replace such unobserved quantities by the conditional expectation given the
observations; since this is parameter-dependent, one is led to an iterative
scheme, e.g.

$$t_{ij}^{(n+1)} = \frac{\mathbb{E}_{\alpha^{(n)},T^{(n)}}\bigl(N_{ij} \mid \zeta_1, \ldots, \zeta_N\bigr)}{\mathbb{E}_{\alpha^{(n)},T^{(n)}}\bigl(T_i \mid \zeta_1, \ldots, \zeta_N\bigr)} \quad (i \ne j),$$

and similarly for the $\alpha_i^{(n+1)}$. The crux is the computation of the
conditional expectations. E.g., it is easy to see that

$$\mathbb{E}_{\alpha^{(n)},T^{(n)}}\bigl(T_i \mid \zeta_1, \ldots, \zeta_N\bigr) = \sum_{k=1}^{N} \mathbb{E}_{\alpha^{(n)},T^{(n)}}\Bigl[\int_0^{\zeta_k} I\bigl(J_t^{(k)} = i\bigr)\,dt \,\Big|\, \zeta_k\Bigr] = \sum_{k=1}^{N} \frac{\int_0^{\zeta_k} \alpha^{(n)} e^{T^{(n)}x} e_i \cdot e_i^{\mathsf{T}} e^{T^{(n)}(\zeta_k - x)} t^{(n)}\,dx}{\alpha^{(n)} e^{T^{(n)}\zeta_k} t^{(n)}}$$

and this and similar expressions are then computed by numerical solution of a
set of differential equations.
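
To make the occurrence-exposure estimators concrete, the following sketch treats the idealized situation in which the whole phase process is observed. It is not the EM algorithm of [91], only the complete-data step that EM replaces by conditional expectations. The simulation routine, the function names, the seed and the two-phase example are all assumptions made for illustration; the index `n` plays the role of the absorbing state $\Delta$.

```python
import random


def simulate_path(alpha, T, rng):
    """Simulate one phase process until absorption in Delta (index n)."""
    n = len(alpha)
    state = rng.choices(range(n), weights=alpha)[0]
    start, time_in, jumps = state, [0.0] * n, {}
    while True:
        # Exponential holding time with rate -T[state][state].
        time_in[state] += rng.expovariate(-T[state][state])
        exit_rate = max(0.0, -sum(T[state]))
        weights = [T[state][j] if j != state else 0.0 for j in range(n)]
        nxt = rng.choices(range(n + 1), weights=weights + [exit_rate])[0]
        jumps[(state, nxt)] = jumps.get((state, nxt), 0) + 1
        if nxt == n:
            return start, time_in, jumps
        state = nxt


def occurrence_exposure(paths, n):
    """Complete-data estimators alpha_hat_i and t_hat_ij = N_ij / T_i."""
    N = len(paths)
    alpha_hat = [sum(1 for s, _, _ in paths if s == i) / N for i in range(n)]
    T_i = [sum(p[1][i] for p in paths) for i in range(n)]  # time spent in i
    N_ij = {}
    for _, _, jumps in paths:
        for key, cnt in jumps.items():
            N_ij[key] = N_ij.get(key, 0) + cnt             # jumps i -> j
    t_hat = [[N_ij.get((i, j), 0) / T_i[i] if j != i else 0.0
              for j in range(n + 1)] for i in range(n)]
    return alpha_hat, t_hat


rng = random.Random(0)
alpha, T = [0.7, 0.3], [[-3.0, 1.0], [0.5, -2.0]]  # exit rates t = (2, 1.5)
paths = [simulate_path(alpha, T, rng) for _ in range(20000)]
alpha_hat, t_hat = occurrence_exposure(paths, 2)
```

The last column of `t_hat` estimates the exit rates $t_i$. With only the lifetimes $\zeta_k$ observed, `T_i` and `N_ij` are not available, which is exactly where the conditional expectations of the EM scheme enter.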

In practice, the methods of [179] and [91] appear to produce almost identical 
results. Thus, it seems open whether the restriction to the acyclic case is a 
severe loss of generality. 

Yet a third method, implemented in Bladt & Lauritzen [172], is based on 
Markov chain Monte Carlo where the main computational step is based on 
simulation. 
A6 Tauberian theorems


The following classical results (see e.g. Bingham, Goldie & Teugels
[169, Th.1.7.1 and Th.8.1.6]) on regularly varying tails are often useful for
asymptotic results on ruin probabilities $\psi(u)$ for $u \to \infty$ when some
information on the behavior of its Laplace transform $\hat\psi[-s]$ for
$s \to 0$ is available.


Theorem A6.1 Let $U$ be a non-decreasing right-continuous function on
$\mathbb{R}$ with $U(x) = 0$ for $x < 0$ and denote by
$\hat U[-s] = \int_0^\infty e^{-sx}\,dU(x)$ its Laplace-Stieltjes transform. If
$L(x)$ is a slowly varying function and $c > 0$, $\alpha > 0$, then the
following two assertions are equivalent:


(i) $U(x) \sim c\,x^{\alpha} L(x)/\Gamma(1+\alpha)$, $x \to \infty$,
(ii) $\hat U[-s] \sim c\,s^{-\alpha} L(1/s)$, $s \downarrow 0$.
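
As a simple check of the normalization in Theorem A6.1 (our own worked instance, with $c = 1$ and $L \equiv 1$), take $U(x) = x^{\alpha}/\Gamma(1+\alpha)$ for $x \ge 0$. Then both (i) and (ii) hold with equality:

```latex
% U(x) = x^\alpha / \Gamma(1+\alpha), so dU(x) = x^{\alpha-1}/\Gamma(\alpha)\,dx
% because \Gamma(1+\alpha) = \alpha\,\Gamma(\alpha).
\hat U[-s] = \int_0^\infty e^{-sx}\, \frac{x^{\alpha-1}}{\Gamma(\alpha)}\, dx
           = \frac{1}{\Gamma(\alpha)} \cdot \frac{\Gamma(\alpha)}{s^{\alpha}}
           = s^{-\alpha}, \qquad s > 0 .
```

The factor $\Gamma(1+\alpha)$ in (i) is thus exactly what is needed for the transform in (ii) to carry no extra constant.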
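Consider next a positive r.v. $X$ with c.d.f. $F$, Laplace-Stieltjes transform
$\hat F[-s] = \int_0^\infty e^{-sx}\,dF(x)$ and $\mu_n = E(X^n)$. Denote

$$f_n(s) = (-1)^{n+1} \Bigl( \hat F[-s] - \sum_{k=0}^{n} \mu_k \frac{(-s)^k}{k!} \Bigr)$$

and

$$g_n(s) = \frac{d^n}{ds^n} f_n(s) = \mu_n - (-1)^n \frac{d^n}{ds^n} \hat F[-s].$$

In particular, $f_0(s) = g_0(s) = 1 - \hat F[-s]$.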


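Theorem A6.2 Let $L(x)$ be a slowly varying function and $\mu_n < \infty$.
Write $\alpha = n + \eta$ with $0 \le \eta \le 1$. Then the following
assertions are equivalent:


(i) $f_n(s) \sim s^{\alpha} L(1/s)$, $s \downarrow 0$,


(ii) $g_n(s) \sim \dfrac{\Gamma(\alpha+1)}{\Gamma(\eta+1)}\, s^{\eta} L(1/s)$, $s \downarrow 0$,


(iii) $\int_x^\infty t^n\,dF(t) \sim n!\,L(x)$, $x \to \infty$, when $\eta = 0$,
$1 - F(x) \sim \dfrac{(-1)^n}{\Gamma(1-\alpha)}\, x^{-\alpha} L(x)$, $x \to \infty$, when $0 < \eta < 1$,
$\int_0^x t^{n+1}\,dF(t) \sim (n+1)!\,L(x)$, $x \to \infty$, when $\eta = 1$.


For $\eta > 0$ a further equivalent statement is

$$(-1)^{n+1} \frac{d^{n+1}}{ds^{n+1}} \hat F[-s] \sim \frac{\Gamma(\alpha+1)}{\Gamma(\eta)}\, s^{\eta-1} L(1/s), \quad s \downarrow 0.$$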